Thingful Blog
thingful · 5 years ago
On The Diversity of Search Engines for the Internet of Things
While we’ve been hard at work over the last six years building out our own IoT search engine, we recently noticed that many others are now working in the “Search Engines for the Internet of Things” space, and that the industry is gathering steam; read on to hear about some of them.
Well, we never expected Thingful to be the only Internet of Things Search Engine – we’ve always assumed that if we’re working in a valuable space, others would soon join us. Now 5G is here, the IoT keeps growing, and data is increasingly seen as powering the ‘pulse of the planet’. For decades, organisations have been able to use data from their internal systems and processes to improve their operations, but the real opportunity in IoT arises when people and organisations can benefit from each other's data, not just their own closed-system data – while respecting stringent privacy and security frameworks. But this is only possible when such data is ‘discoverable’ by others (again, respecting the privacy & security parameters that the owner has adopted).
IoT Search Engines are becoming a fundamental part of the next phase of the Internet of Things. Shodan, IoTCrawler, Reposify, Censys, and many more Search Engines for the IoT have come into existence since the inception of Thingful.
Back in 2013/2014, Thingful was the only IoT Search Engine, unique in its focus on making the open Internet of Things useful and practical. We have thought long and hard about the need to understand data and how it's produced by people and their connected ‘things’, particularly concerning the physical and natural environment. Much of our work over the last few years has been to enable for private data the same kind of discoverability that is much easier to achieve with public data – building the ‘entitlement’ tools through which data owners can circumscribe how their data is found and used by others (if at all), while still making it interoperable and accessible. Read more about this in our articles 'Enabling secure discoverability & interoperability between millions of public & private IoT objects around the world', and 'Managing Privacy in the Internet of Things' (Harvard Business Review).
Watching all these new IoT Search Engines emerge is tremendously exciting – we are at the beginning of a paradigm shift when it comes to data rights and ownership, and the Search Engines that respect these are the ones we believe will succeed in the long run. In a world dominated by giant tech companies profiting from the online presence of people, and nudging behaviours without our knowledge, our role as technologists, data scientists, and designers becomes more important than ever. However, not all is bad. As Yuval Noah Harari mentioned in this FT article, when technology gives one group of people the power to monitor another, the same technology can be used in reverse by the surveilled party. With the right tools, people have the opportunity to choose, rather than passively follow systems created without their consent. For these reasons, if IoT data is to be interoperable and, to a certain extent, ‘shared’ between organisations, we must focus on rendering data accessible, transparent, and easily understandable to everyone – today more than ever.
We would like to conclude this blog post with a provocation, and a question we hope can inspire technologists, data scientists, and designers in their practice.
Data has been characterised in many different ways: a tsunami, oil, etc. Consider, for a moment, another analogy – that data is like the letters of the alphabet: invented and crafted, but with a certain commensurability given the right context. When reading ‘letters’ generated in patterns by others, people can try to make sense of and understand what they think a writer is trying to communicate. But that message can be understood in many different ways, and the ‘letters’ of that communication do not define its “veracity”. In times of uncertainty, fake news, and information overload, we are seeing authoritarianism spreading in both the digital and physical realms – through large technology companies on one hand and political structures on the other. People have to make choices and decisions about how they ‘read’, ‘interpret’ and act upon data, and this will, in part, be based on a set of perceptions and presumptions already in place, helping either to reinforce or challenge those interpretations, decisions, and actions.

As exemplified in Data Feminism, a new book from MIT Press by Catherine D'Ignazio and Lauren F. Klein, “data is never [...] raw, truthful input, and it is never neutral”. On a side note, this book not only offers new ways of thinking about data science and data ethics but is itself the product of an innovative approach to data-driven collaboration, participation & deliberation. You may be interested to follow the reading group hosted by the authors, where they discuss a chapter of the book every week.
www.elviavasconcelos.com is documenting the reading group @sketchnotes_are_awesome
Our big question is: how do we best provide the tools for people to become data literate, particularly in an age of IoT data sharing and interoperability, empowering us all to make sense of and use data in a way that is beneficial to the collective, but doesn’t just preserve inappropriate structures and ideologies? 
Future blog posts will try to answer that, so keep an eye on our Twitter and LinkedIn!
thingful · 5 years ago
Thingful Update – 2020
We’ve been pretty quiet the last few months, but we’re finally back with lots of news! We have been working on three huge projects in different parts of the world and want to update you on our latest work, addressing some of the most pressing challenges of our time – the climate emergency, and data ownership and usage on the internet.
DECODE – Tools that put individuals in control
Over the last couple of years, we’ve been proudly collaborating with a number of EU-based organisations, such as NESTA, Arduino and Dyne.org, and cities like Amsterdam and Barcelona, in the delivery of DECODE, an experimental project, funded by Horizon 2020, to develop practical alternatives to how we use the internet – providing decentralised, privacy-enhancing, rights-preserving tools to restore data sovereignty to people and enable citizens’ digital rights. The project has almost wrapped up its exploration of how to build a data-centred digital economy where data that is generated and gathered by citizens, the Internet of Things (IoT), and sensor networks is available for broader communal use, with appropriate privacy protections. The aim has been to enable innovators, startups, NGOs, cooperatives, and local communities to use that data to build systems and services centred on their needs and those of the wider community – watch this video for further information. Here you can read more on the wider social value that comes from enabling individuals to take control of their personal data and giving them the means to share it on their own terms, in the community pilots in Barcelona and Amsterdam.
We believe in DECODE’s potential to validate a different approach to data collection and usage than the manipulation and commercial exploitation of personal information by a few tech behemoths, and although the project is over, we’re looking forward to seeing how its outcomes transform the industry.
GROW Observatory –  A Citizen Observatory to take action on climate, soil, and food
We also recently wrapped up our work on GROW Observatory, a citizens’ observatory mobilising a movement of food producers across Europe. Along with a consortium of eighteen partners, we worked on improving high-value data management to foster regenerative cultivation techniques, soil management, and sustainable production, helping EU agriculture adapt broadly to climate change. For us, GROW is particularly important for the way it contributes to the UN’s 2030 Agenda for Sustainable Development Goals (SDGs) – watch this video for further information. To learn more about the people in the movement, read the stories of the GROW communities around Europe. Our contribution was both technological (building some of the data capture and sense-making systems) and more general (developing the financial sustainability plans and models). We would like to thank all partners of the consortium, particularly the University of Dundee, University of Edinburgh, and Future Everything, with whom we worked long nights on some of the final deliverables!
Project Alva (internal working title)
The third project that’s kept us enormously busy is still under wraps, but it’s the one we’re most proud of, so do stay tuned for announcements as soon as it’s publicly launched. What we’ve been calling Project Alva was created to further our mission of promoting data accessibility, transparency and literacy in education. Keep an eye on our social media platforms to be the first to know more about it when it’s out!
We are glad to be back – more blog posts coming soon, so keep an eye on our Twitter and LinkedIn!
thingful · 8 years ago
Experimenting with multistage Docker builds
We’ve been generally agnostic about Docker round these parts; its value for setting up complex local development environments is obvious, but the idea of taking these containers to production was a bit scarier (at least to me). However, I think the production story is much more clearly defined now, and with products like Kubernetes or Amazon’s EC2 Container Service, we are finally starting to experiment with using container-based packaging to deploy some production services.
This post isn’t about the deployment side of things, however; it’s rather a little note discussing some experiments we did recently with a relatively new Docker feature called multistage builds. (Note: multistage builds require Docker version 17.05 or later.)
There’s lots of information on the Docker documentation site about multistage builds, so I won’t go over those details again; rather, here I just want to post a link to a little repo we created to experiment with what this workflow might look like.
Here’s the GitHub repo containing our simple project: https://github.com/thingful/simple
There’s not much to the project, it’s just a tiny HTTP server with a couple of endpoints, however it contains a multistage Dockerfile that looks like this:
# build stage
FROM golang:alpine AS build-env
RUN apk --no-cache add build-base bash
ADD . /go/src/github.com/thingful/simple
WORKDIR /go/src/github.com/thingful/simple
RUN make test && make compile

# final stage
FROM alpine
RUN apk --no-cache add ca-certificates
WORKDIR /app
COPY --from=build-env /go/src/github.com/thingful/simple /app/
ENTRYPOINT ./simple
EXPOSE 8080
Looking quickly at the above, you should be able to see that we run our tests and compile our binary in the initial build stage, then copy the generated binary into a new container to produce our final artifact (for details on the test and compilation steps, please see the Makefile in the root of the repo).
What I liked about this:
The multistage syntax is clean and, I think, clear
The final built image is indeed nice and small – this tiny server produced an image of around 10MB (it probably could have been smaller if we’d used the scratch image for the final stage, but that would have required more hacky manoeuvres to handle things like SSL cert bundles).
What I wasn’t so sure about:
In the Dockerfile above I inserted a test step directly before the compilation task. This was fine for this little toy example, but it might be annoying in larger projects where running the full test suite can take a while.
In relation to that, I’m not sure whether there’s a way of running just the tests without building the final container, or, more generally, whether there’s a way of getting inside the build-stage container should you wish to. Maybe that’s a foolish thing to want to do – I don’t know.
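One possibility we haven’t tried yet: the Docker documentation describes a --target flag for docker build (e.g. docker build --target build-env .) that stops the build at a named stage, which sounds like it would run the build stage – tests included – without producing the final image. Treat that as a pointer rather than a tested recipe.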
As always any comments or feedback would be gratefully received.
thingful · 8 years ago
Interoperable Internet of Things – BIG IoT Project
When we started out in 2013, our hypothesis about the central value proposition in IoT was around interoperable data from across networks, searchable and accessible. This hypothesis, we realised four years later, was ahead of its time despite the rational appeal of the idea. Most data owners that collect data from their own networks of devices are generally not interested in differential discovery of, or access to, that data outside the main use case for which the device network was deployed in the first place. However, what we have started witnessing is a shift towards data-driven business models powered by the connectivity of devices that enable asset or process monitoring.
For example, rather than selling telematics black boxes as hardware, which has a limit to its growth and value, there is a real move towards offering data-linked analytical services to the businesses that buy the black boxes. An insurance company or a commercial fleet operator is only interested in getting actionable information about its assets, not in managing hardware and data. This shift in business model is likely to depress upfront hardware revenues for such businesses but holds the promise of generating long-term recurring revenue streams. Cash breakeven on the investment will require sustained subscriptions, growth of the subscriber base, and potential monetisation paths outside the traditional customer base. This is where a truly interoperable IoT holds its power and promise. Across networks within a single domain, say transport, the description of data is highly proprietary in terms of semantics, messaging, formats, storage, and protocols. We expect a steady move by connected-hardware businesses to shift their business model towards data-powered services.
In this move, external IoT data is highly relevant and valuable to augment and differentiate the offering, and while the subscriber base ramps up, other monetisation avenues for data – whether through aggregation or differential sharing of non-identifiable outputs – remain very attractive. The deployment of devices will continue apace, and interoperation between networks of devices will become more and more attractive as a proposition. For now, potential customers understand the high-level logic but don’t have clear-cut use cases from which an RoI-based model can be derived. This is a natural state of play in the absence of an apparent ‘problem to be solved’; however, the need for differentiation is pushing businesses to explore the possibilities.

Data has some strange characteristics: combining it with data from other diverse sources is likely to generate novel insights based on ‘correlation’, for which causal linkages may be hard to imagine. The power of correlational analysis is no stranger to the financial world, where algorithmic trading is based on identifying correlations and making buy-sell decisions on such relationships.

Other attempts to make IoT data interoperable are driven by a vision of an ‘IoT data marketplace’. One such project is the Horizon 2020 funded BIG IoT, an interoperable marketplace for sensor data from a variety of domains without a centralising data store. We will be working with BIG IoT to enable an interoperable link between Thingful’s public data search and access service and the market participants active in BIG IoT across the three European cities and project partners like Siemens and Bosch, with a focus on transportation.
We will start the project in September and build a software bridge between Thingful and BIG IoT for seamless data search and access (the first of its kind), along with a common semantic vocabulary to provide very powerful search and access capabilities in both directions (from Thingful to BIG IoT and vice versa). This project will be a key demonstrator of Thingful’s IoT data accessibility services and will expand its index in the transportation domain.
thingful · 8 years ago
Glimpsing use cases for Edge Computing
We are developing Device-Hub, software that works at the edge of the Internet to retrieve, process and send/publish data from IoT devices. At a quick glance, one might think its use is limited to a few cases (home and office environment automation), but there are many more, especially industrial applications, that can be addressed. Edge computing isn’t an evolution of those cases, though, but an informed reuse of technologies that are leading other fields, applied inside closed industrial environments.
Advances in the industrial field haven’t come as fast as advances in web technologies, and that slowness to embrace new technologies and approaches has been seen as a strength of the old, classic technologies used for years in industrial settings. Industrial users need reliability, but today those users see how easy it is to get real-time information from the Internet, and wonder whether they can get some of that in their industrial environments.
The main challenge is that industrial users are mostly limited to a narrow range of offerings and have to follow the pace of the big industrial OEMs; in many cases they can’t afford to update the technologies the OEMs provide. Yes, classic OEMs have the experience and their own proprietary technologies, which in many cases are very expensive, but today technologies are becoming more standard, and that’s the trend – the industrial sector can’t be the exception.
Today, edge technologies can displace data-logger technologies, and can collect and process data from PLCs (Programmable Logic Controllers) and PACs (Programmable Automation Controllers). Edge technologies can talk to sensors, pumps, motors and relays using almost any protocol you want – and yes, they can even talk to Modbus devices if you want.
Some important issues that edge technologies try to address include (see the sketch after this list):
Improving system response time in remote, mission-critical applications
Reducing the amount of data sent to the cloud
Decreasing network and Internet latency
Making better use of the processing power and communication capabilities of different appliances
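To make the second point concrete, here is a minimal sketch – the sensor read and the ingest endpoint are hypothetical stand-ins, not Device-Hub’s actual API – of an edge process that aggregates raw readings locally and uploads only a per-minute summary:

package main

import (
	"bytes"
	"encoding/json"
	"log"
	"math/rand"
	"net/http"
	"time"
)

// summary is the only payload that leaves the edge device.
type summary struct {
	Min, Max, Mean float64
	Samples        int
}

// readSensor stands in for a real device read (e.g. a Modbus register).
func readSensor() float64 {
	return 20 + rand.Float64()*5
}

func main() {
	var sum, lo, hi float64
	n := 0
	sample := time.Tick(time.Second)
	flush := time.Tick(time.Minute)
	for {
		select {
		case <-sample:
			v := readSensor()
			if n == 0 || v < lo {
				lo = v
			}
			if n == 0 || v > hi {
				hi = v
			}
			sum += v
			n++
		case <-flush:
			if n == 0 {
				continue
			}
			// One small upload per minute instead of sixty raw readings.
			body, _ := json.Marshal(summary{Min: lo, Max: hi, Mean: sum / float64(n), Samples: n})
			resp, err := http.Post("https://example.com/ingest", "application/json", bytes.NewReader(body))
			if err != nil {
				log.Println("upload failed:", err)
			} else {
				resp.Body.Close()
			}
			sum, n = 0, 0
		}
	}
}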
If you are interested in these subjects, please get in touch.
thingful · 8 years ago
Documenting APIs with Swagger
During the last few weeks we have been discussing how to best document the Thingful API and what tools we should use to accomplish the task.
Despite being a necessary step for making software and APIs usable by both internal and external users, writing and maintaining documentation is not necessarily the most exciting job developers want to do. It is usually more interesting to write code than to describe how to use it.
Luckily for us, there are tools we can use to make our lives simpler and more fun, even when it comes to writing API manuals, and Swagger is a perfect example of a modern, popular framework that helps accomplish that.
According to its own definition, Swagger is a set of specifications that provides a language-agnostic interface for designing, building, testing and publishing documentation about the capabilities of a RESTful web service.
On top of the tools that come with Swagger, which include but are not limited to an online editor and a customisable user interface, there is a wide range of libraries that enable developers to use their programming language of choice to generate OpenAPI compatible swagger documentation.
Since Golang is one of the main languages used at Thingful, I decided to explore the go-swagger package and try to document a little demo application.
One of the key requirements was to be able to generate swagger specifications directly from code annotations in order to keep definitions of actions, routes, response and request bodies as close as possible to the code that generates them. Go Swagger can accomplish this through special tags that can be used as code comments.
Since the demo app I built has only one endpoint, I tagged the handler function in charge of dealing with HTTP requests and responses with the swagger:route GET /items item-operation comment. This route tag defines the HTTP method the handler accepts (in this case GET), the path where the handler will be invoked (/items) and any ID we want to use to refer to this route. IDs are very useful for associating a route with any other operation, such as responses or requests, tagged with the same ID.
I also specified what responses the route will generate using the Responses: 200: itemResponse 404: errorNotFound tag.
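The original post showed this as an image; reconstructed from the description, the annotated handler looked something like the following (the handler name and body are illustrative):

// swagger:route GET /items item-operation
//
// Lists items, optionally filtered by a query parameter.
//
// Responses:
//   200: itemResponse
//   404: errorNotFound
func itemsHandler(w http.ResponseWriter, r *http.Request) {
	// ... look up items and write the JSON response ...
}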
This block of comments specifies that in case of success the handler will return a 200 status code and an itemResponse object, and in case of an error it will return a 404 status code and an errorNotFound object. Again, itemResponse and errorNotFound are IDs used to associate a type defined somewhere else with the current route.
Since the handler allows an optional query argument, I defined an object to hold the query value and commented it with the special swagger:parameters item-operation tag. Notice that I reused the item-operation label previously defined in the route tag.
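(Again a reconstruction, since the original showed an image; the struct and field names are illustrative.)

// swagger:parameters item-operation
type itemParams struct {
	// An optional string used to filter the returned items.
	// in: query
	Query string `json:"query"`
}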
Since responses can be of two types – success or error – I defined a success object (or struct, in Go lingo) and labelled it swagger:response itemResponse. Similarly, I defined an error object and labelled it swagger:response errorNotFound.
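(Reconstructed likewise; the body fields are illustrative.)

// swagger:response itemResponse
type itemResponse struct {
	// in: body
	Body []struct {
		ID   int    `json:"id"`
		Name string `json:"name"`
	} `json:"body"`
}

// swagger:response errorNotFound
type errorNotFound struct {
	// in: body
	Body struct {
		Message string `json:"message"`
	} `json:"body"`
}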
All these comments and tags can then be parsed by Go Swagger, translated into JSON (or YAML) format and output to a spec file that can be consumed by the Swagger UI.
Although it's not perfect, I really liked this approach, since it allowed me to quickly and easily generate documentation for my little application. Specifically, I found it very useful to be able to describe how to query and render resources at the same time as I was writing the code actually responsible for doing it.
The final documentation can be found online on SwaggerHub and the demo API can be accessed at https://simple-swagger.herokuapp.com/items.
The code used to generate the API and the documentation is available on GitHub at https://github.com/thingful/simpleswagger.
I encourage anyone interested to download and play with the demo, and to share any comments or insights about Swagger and its usage.
thingful · 8 years ago
The challenges of semantic structures in IoT metadata (part 1)
Introduction
Over the past few months we have been developing features for semantic search on Thingful, and it has been very exciting. We want to share many aspects of it with you, but there is a lot to cover, so this blog post will be an introduction, followed by more articles covering these features.
There are many good reads out there about semantic technology (please see the end of this blog post), but I want to discuss it from a practical point of view, from our own experiences: what problem we had, and how we tackled it.
Indexing
We launched Thingful.net, the search engine for the Internet of Things, in 2013. Thingful.net's main user interface is a map. Users search using keywords (What? and Where?), and the map updates to display the search results, which users can navigate through.
By the end of 2016 we had indexed millions of connected objects. Our approach when indexing things was to stay true to how the data providers present the information. We try not to touch or manipulate the data; we want to keep it as close to the original as possible.
For example, if the provider names a channel “bikes” or “NO”, we will also have “bikes” or “NO” as the channel name on Thingful.
There are many reasons behind this approach. The important one is the “purity of data”: we are only the medium. We don't have the knowledge to manipulate the data, and even if we had the knowledge, we may not have the permission to do so.
Map search is broad
So how does Thingful search work? We do string matching in our database on many properties of things. Among them is a property called “metadata”, which contains as much information as possible to cover all the keywords that users might use to search for a specific thing.
For example, an OpenAQ thing will have the following metadata fields:
“Pollution, particulate matter, nitrogen dioxide, aerosol, respiration, asthma, respiratory problems”
The metadata field's purpose is to be as broad as possible, and that is by design, to serve the map search. We want to make sure that things related to the keywords always show up in the search results. This approach worked fine for us, because it is relatively easy for users to browse through things using the map.
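As a sketch of what that broad matching amounts to (simplified – the real index does rather more than a substring scan), a thing matches when the keyword appears anywhere in its metadata string:

package search

import "strings"

// matches reports whether a search keyword hits a thing's metadata.
func matches(metadata, keyword string) bool {
	return strings.Contains(strings.ToLower(metadata), strings.ToLower(keyword))
}

With this, a search for “asthma” finds the OpenAQ thing above even though none of its channels is named “asthma”.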
API search is specific
In 2016, we released the beta version of the Search API. The API worked similarly to Thingful.net's keyword search.
As part of our own validation, we ran an internal experiment to test our API. The experiment was to make an app that requires data interoperability. The first step is to search for ______ regardless of data providers, networks, countries or languages.
We found that users' expectations of API search are very different from map search on Thingful.net. As an API consumer (an application or application developer), you tend to want the “correct” or definitive answers. Filtering the results might be OK for human users on the map interface, but API search is different: there should be no filtering required.
Because of the difference in context, map search is broad while API search tends to be specific, and it is wrong to assume that they are the same.
Purity of data = Untidy data ?
The “same” data can be named very differently. For example, one provider might call “the number of available rental bicycles at a bicycle dock” “Bikes”, while another provider might call it “Available bikes”, or even name it in another language. As a consequence, searching for “Available bikes” will not return results for “Bikes”.
This is somewhat intentional due to the “purity of data” policy. But this makes using data across different networks really hard.
To make that app work, I ended up with a list of names that (I thought) were the same, so that the app could work with different data providers. The code snippet below is the part of the code where I defined the different channel names:
const CHANNEL_NAME_TEMPLATE = {
"availablebikes" : ["availablebikes", "bikes"],
"windgust" : ["windgust"],
"windspeed" : ["windspeed", "weatherwindspeed", "wundergroundwindspeed"],
"no2" : ["no2", "nitricoxide"],
"feelsliketemperature" : ["feelsliketemperature", "outsideairtemperature", "weathertemp", "wundergroundtemp"],
"precipitationprobability" : ["precipitationprobability","rainfall"],
"uvindex" : ["uvindex"]
};
Please note that the code was written around the end of 2016, and it was for demonstration purposes. It wasn't my intention to make it incorrect, but six months have passed, I now have a bit more domain knowledge, and I have spotted a few mistakes in the snippet. For example:
no2 (nitrogen dioxide) is not the same as nitricoxide (NO, nitric oxide),
feelsliketemperature is different from the rest of its group because it is calculated from air temperature, humidity and wind speed.
However, this proves the point that manipulating data requires a lot of domain knowledge.
New direction & questions
As a result of that experiment, we decided to improve the usability of the Search API by making it more specific, in two parts:
The structure of search: allowing multiple parameters enables search by unit type and measurement type, for example.
The meaning of search: the keywords used to search on those parameters have to be a “common” language, something that is universally referred to and understood.
The second part in particular has set us on the path of Semantic Thingful: when the things we index can be described using a common language, we all share the same understanding.
However, this direction raised many questions including:
Purity vs Usability of data? Who has the right or knowledge to manipulate the data?
What do we sacrifice for making the search more specific? Accessibility? 
Do we have to broaden the search again on the map interface?
The next blog post will be about our approach to semantic structures in IoT metadata.
Useful reads on Semantic Web
Original proposal by Tim Berners-Lee in 2001
Introduction to Linked Data and Semantic Web
Real world example, Google’s Structured Data uses vocabularies from http://schema.org/
JSON-LD, JSON for Linking Data
thingful · 8 years ago
Introducing the DECODE project
Since the beginning of this year Thingful has been a member of the consortium behind the DECODE project.
DECODE is a European Commission funded project exploring and piloting new technologies that give people more control over how they store, manage and use personal data generated online. We will test the technology we develop in two pilot sites and will explore the social benefits of widespread open data commons.
Thingful’s involvement has been to bring our experience in IoT, data entitlement and semantic tagging to the project.
One of our first pieces of work has been contributing to a survey of data entitlements and blockchain technologies alongside our colleagues in University College London's Information Security Research Group, under the professorship of George Danezis.
Our involvement is set to continue over the next three years, and we are very excited about the project's potential.
thingful · 8 years ago
Data licenses - CC or not CC
We've spent a lot of time on getting data into our platform, but until recently we haven't focused much on how we might make our data useful and usable. These goals obviously cover an enormous amount of ground, but in this short post I'm talking specifically about the licensing of data: the approach we have adopted to make licensing information explicit in how we represent data from a particular source and, once it is explicit, how we use that information as part of our data search while meeting license constraints.
It is important to note that nothing here constitutes legal advice on data licensing issues, whether implied or otherwise.
There's a lot of work out there already on data licensing; for example, this guide published by the Open Data Institute has a lot of general information for data publishers on how they might make their data available under an open license of some sort. In addition there are many open data licenses out there, ranging from the Creative Commons licenses to licenses published by the Open Data Commons, and the UK government has developed and released an Open Government License for data published by public sector bodies.
The question of what kind of license to use for your data is beyond the scope of this post, but our interest in this area was sparked by a document published by Creative Commons proposing the Creative Commons Rights Expression Language (CC REL).
CC REL is an RDF vocabulary for describing the permissions, requirements, and prohibitions of data licenses in a neutral but rigorous way. This vocabulary for describing licenses provides a set of building blocks by which any data license may be represented.
The way we plan to represent this information for individual connected devices (’things’) is by capturing metadata like this:
{  "data_license: {    "name": "Open Government License version 3.0",    "url": "http://www.nationalarchives.gov.uk/doc/open-...",    "legal_code": "http:///www.nationalarchives.gov.uk/doc/...",    "permits": [ "cc:Reproduction", "cc:Distribution",                 "cc:DerivativeWorks", "cc:Sharing" ],    "requires": [ "cc:Attribution", "cc:Notice" ],    "prohibits": []  } }
The format of the above is still slightly in flux, but by capturing this data where it is available, we can start answering questions for a data searcher like: "Can I reproduce this data?", "Can I create derivative works based on this data?", "Do I have to attribute the source of this data?", "Does this data prohibit commercial use?", etc. Being able to expose and query this information, we feel, changes data from being something vaguely interesting for hobbyists into something that could actually be used in an academic, commercial or governmental setting.
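As a sketch of how such captured license metadata might be put to work (the field names follow the JSON above; the Allows logic is our illustrative reading of the permits/prohibits lists, not a legal tool):

package main

import "fmt"

// DataLicense mirrors the data_license block shown above.
type DataLicense struct {
	Name      string   `json:"name"`
	URL       string   `json:"url"`
	Permits   []string `json:"permits"`
	Requires  []string `json:"requires"`
	Prohibits []string `json:"prohibits"`
}

// Allows answers "can I do X with this data?": prohibitions win, and
// anything not explicitly permitted is treated as not allowed.
func (l DataLicense) Allows(action string) bool {
	for _, p := range l.Prohibits {
		if p == action {
			return false
		}
	}
	for _, p := range l.Permits {
		if p == action {
			return true
		}
	}
	return false
}

func main() {
	ogl := DataLicense{
		Name:     "Open Government License version 3.0",
		Permits:  []string{"cc:Reproduction", "cc:Distribution", "cc:DerivativeWorks", "cc:Sharing"},
		Requires: []string{"cc:Attribution", "cc:Notice"},
	}
	fmt.Println(ogl.Allows("cc:DerivativeWorks")) // true
	fmt.Println(ogl.Allows("cc:CommercialUse"))   // false: not explicitly permitted
}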
Of course the stinger in the above paragraph is that dreaded phrase – “where it is available” – as when data is published without any explicit data license, nothing meaningful can be said about whether anyone can use it in any context.
This post is already getting too long, so I won't get into the details of how we are planning to make these properties searchable, but if you'd like to find out more, or you have any comments on the above, we'd love to hear from you.
thingful · 8 years ago
Funding for Connected & Autonomous Vehicle Pilot Scheme in London!
We can finally announce today one of Thingful's biggest projects yet... We are part of a consortium that will deliver an on-road mobility service using connected and autonomous vehicles in London's Queen Elizabeth Olympic Park! This is huge for us, because it's a real-world project that dives into the phenomenal complexity of IoT data delivered in dozens of different formats, in different semantic domains, using different standards and owned by different players in the data ecosystem – all of which make Thingful shine!

Building on our previous work leveraging IoT data in connected cars, Thingful's particular role in this project will be to develop the real-time data interoperability and entitlement frameworks that enable autonomous and connected pods on-demand (PODs) both to find and access necessary data, and to control how vehicle data generated internally is discovered and accessed by others. A large chunk of that will enable interconnections between the intricate traffic management system and the PODs' fleet management system.

We're part of a stellar consortium of 20 partners led by AECOM, including Heathrow, Dynniq, Transportation Simulation Systems and the University of Warwick, that has been awarded more than £4.2 million of funding from Innovate UK and the Centre for Connected & Autonomous Vehicles (CCAV). The CAPRI (Connected & Autonomous POD on-Road Implementation) project consortium was awarded the funding as part of a CCAV and Innovate UK competition to invest £35 million in industry-led research and development projects on CAVs.

It's going to keep us busy for the next couple of years. Needless to say, we are hiring, so if you want to work on complex socio-technological problems for the internet of moving things – get in touch! And stay tuned, we have some more big announcements to make over the next few weeks!
thingful · 9 years ago
Thingful now indexes TransportAPI
We add and update data resources in Thingful’s index every single second – there are millions of connected objects out there for us to discover – so it’s not typically a big deal when we add a new repo to the crawler. But TransportAPI, a digital platform for transport data across the UK, is different.
I struck up a conversation with TransportAPI CEO Jonathan Raper almost by chance at the ODI Summit last year, as we both found ourselves sitting on the same couch checking email. We talked about data APIs and compared and contrasted our businesses, both built on the idea of making real-time data easier to find and use. What was really interesting to us was where the two differed.
TransportAPI takes one single vertical and goes as deep as it can go in normalising data and providing a unified interface, making it the most comprehensive access point for transportation data across the UK. Thingful on the other hand operates horizontally, integrating across dozens of verticals to provide the most comprehensive discovery framework for all sorts of IoT data around the world. Where TransportAPI is narrow and deep, Thingful is shallow and wide. A collaboration seemed both a meaty challenge and a great opportunity to showcase the benefits of each approach.
The first step has been for Thingful to start indexing TransportAPI, which is no small task. While TransportAPI has a feature-rich API, making millions of geolocation queries against each of its individual endpoints (live departures from train, tube and bus stops across the UK) would have been burdensome. Instead, we worked with them to deploy hypercat (a specification for automatic resource discovery in the internet of things) as part of their stack. (For all the hype around hypercat, it is essentially just a simple specification for providing a catalogue of IoT resources in JSON format.) This enabled them, with minimal effort, to list individual resources (i.e. bus stops, etc.) that Thingful could then index and make discoverable on an individual basis.
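For the curious, a hypercat catalogue is about this simple (a hand-written illustration, not TransportAPI's actual catalogue – the endpoint URL and values are made up):

{
  "catalogue-metadata": [
    { "rel": "urn:X-hypercat:rels:isContentType",
      "val": "application/vnd.hypercat.catalogue+json" },
    { "rel": "urn:X-hypercat:rels:hasDescription:en",
      "val": "Live departure resources" }
  ],
  "items": [
    {
      "href": "https://example.com/bus/stop/490000077E/live.json",
      "item-metadata": [
        { "rel": "urn:X-hypercat:rels:hasDescription:en",
          "val": "Bus stop live departures" },
        { "rel": "http://www.w3.org/2003/01/geo/wgs84_pos#lat", "val": "51.50" },
        { "rel": "http://www.w3.org/2003/01/geo/wgs84_pos#long", "val": "-0.12" }
      ]
    }
  ]
}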
As a result of this work, Thingful users can now find train, tube and bus stops by geolocation along with all the other nearby connected objects like weather stations, pollution monitors and flood sensors, and while we continue to build on this collaboration we hope meanwhile this will drive more users towards TransportAPI as well!
If you are interested in knowing more about this work, or if you would like help in implementing hypercat, please get in touch - we would love to talk. 
thingful · 9 years ago
Thingful pivots. Or, the Evolution of an IoT startup
Most startups evolve constantly, looking for the best market fit. But the market itself is also constantly evolving, especially in response to technological development and social transformation, which makes any individual startup's story pretty complex. Sometimes it’s called a pivot. Usually it’s an evolution. Thingful’s evolution, particularly because we’ve been so early to the market, has been a paradigmatic example. We’ve continually evolved over the last three years, learning something new at every stage, always challenging what we’re doing in search of the most valuable proposition.  Here's a recap of what we’ve seen.
In the beginning: Searching public IoT data (2013)
When we launched in December 2013, as a basic search engine for the internet of things, we focused solely on indexing public IoT resources, because just as in the early days of the web we saw our challenge to be making it easier for ordinary folks to find stuff that's already out there and already public. We had decent traction, considering just how early we've been, with hundreds, sometimes thousands, of search queries every day and average dwell times on the site exceeding 10 minutes some months. We learned that ordinary folks, at least some of them, are excited about tapping into the wider internet of things, and Thingful has even been touring the world in kiosk mode as part of the phenomenally successful Big Bang Data exhibition (currently in Singapore).
Thingful watchlists & notifications
Public vs. private data: Discoverability & Entitlement (2014-15)
As the industry evolved over the last couple of years, with more and more IoT products coming online, and established businesses getting connected or using IoT data, and as people have begun to understand the possibilities and limitations of connecting physical objects to networks, the question of how discoverability interacts with *private* IoT data resources has become paramount.
ODI Data Spectrum
This really came to the fore during our time as an Open Data Institute startup, where we used the Data Spectrum to qualify data indexed by Thingful. The promise of IoT is that everything talks to everything else, but when so much IoT data is private, personal and valuable, how can that be done safely and securely, and with owners' explicit consent?
Finding connected objects near you that you’re entitled to access 
We learned that structured and secure interfacing between public and private IoT, and all the shades in between, is going to be absolutely key, as connected technologies get deployed in our cities. Our guiding principle has been that data owners, whether individuals or organisations, must always be in control of who can find and use their data, and that those controls must be fine-grained (i.e. tweaking different settings for entitling different possible data clients). And so Thingful, as a backend service, evolved into a discoverability service founded on the principle of entitlement. See our Urban IoT and Connected Car infrastructure projects by way of example.
Thingful entitlement consent
A detour: IoT blockchain & lambda functions
Along the way we did A LOT of research (and even some development) on blockchain, ethereum, smart contracts, compute services, etc. Turns out we're not ready for such microtransactions just yet. Or maybe they're not ready for us. We learned a lot, so it wasn’t a waste of time, but we'll have more on that at a later date. Meanwhile, check out Connecting IoT with Blockchain by Sam Davies, which for now has much more interesting and practical things to contribute on the subject.
Defining an IoT service product: research & interviews (2016)
Once we had several real-world projects and proofs-of-concept under our belt or under development, which helped us both build and test the technical concepts that were otherwise led by intuition, we focused on refining precisely what Thingful, as a product, actually is. 
As part of that process I've spent the summer of 2016 calling on friends and colleagues around the world, to get their informal take on the state of the industry and where they see the most interesting opportunities. Here are some of the fascinating and productive ideas that have come out of those discussions:
Scott Jenson (Apple, frog, Physical Web) emphasised that we should try to "tame the chaos". When he suggested that I should "think like the web" (i.e. in multiple layers of accessibility) it made even more sense, and chimed with my intuition that we should embrace the heterogeneity of IoT data, and find ways to make that heterogeneity powerful, rather than simply ironing out the wrinkles.
Boris Anthony (Dopplr, Nokia) spoke at length about context, and helped me understand that sometimes what you think you're searching for is better informed by all the stuff around what you're searching for, and that if it's not there, sometimes that other stuff will do. This notion contributed to our experiment on virtual sensors for 'missing' data.
Lee Omar (CEO, Red Ninja) strongly recommended making data access and assessment as simple as possible. He described his business in catering to those wanting to get involved with IoT (e.g. cities looking to deploy air quality sensors), which helped me understand Thingful's role in supporting that process (i.e. not doing the deployment ourselves, but instead providing the discoverability and interoperability tools to make that deployment even more useful).
Mark Cheverton (CTO, Red Gate) echoing some of Scott's suggestions, had some great advice on finding ways to make it easy to deal with 'mess', (which is a Red Gate forte!) and cautioned against a pure data market business (which is "great for number one, but not so much for anyone else").
One of my favourite brainstorming sessions was with Ben Blume (Atomico), in which we explored what's possible when available IoT data isn't "good" enough and how machine learning can help (another conversation that fed into our 'missing' data experiment). We also talked about how PaaS offerings can sometimes undercut (from an investment perspective) the defensibility of customer apps that are built on top of it, which has helped us consider how best Thingful can empower rather than restrict (i.e. to be a 'bridge' rather than a 'toll road').
These conversations and a dozen others have been invaluable for me in questioning all our assumptions, and making sense of all the things we have built since 2013. I’m extremely grateful for everyone’s openness and generosity. So thank you to all who contributed!
Notes from summer research conversations: Matt Webb (CEO, Berg), Matt Biddulph (CTO, Thington), Dominique Guinard (CTO, Evrythng), Pilgrim Beart (CEO, DevicePilot)
Bringing it all together: interoperability glue
After this period of solid introspection, I now understand that Thingful at its heart is interoperability "glue" – glue that binds together messy networks. 
Thingful binds internal networks (e.g. cities, or facilities like airports that have multiple data platforms and systems that need to coordinate); binds closed networks to each other (e.g. connected cars securely sharing data with dozens of services near and far); binds closed networks to the wider IoT universe (e.g. emergency services that need unforeseen data at unforeseen locations, or smart home products that need sensor data from the city around them); and helps ordinary folks in cities make sense of and make use of all the public connected objects that already exist around them.
One thing that has become very clear is that Thingful is not a platform, in the sense that most IoT platforms describe themselves as being something upon which others build their products. There are THOUSANDS of IoT platforms, and they almost all cater to what I would call 'first-generation' IoT deployments – helping connect physical objects to a network. Platforms tend to be single points of failure: you outsource, and you become dependent. That’s why so many individuals and companies tend to build their own. Outsourcing makes a lot of sense to first-generation IoT deployments, but most astute observers of the IoT industry have become wary of founding entire businesses on other people’s platforms. If walled gardens are going to persist, for logical reasons, then the next phase of IoT is going to be bridging those walled gardens.
Thingful by contrast caters to 'second-generation' IoT deployments – helping organisations that already have platforms of connected objects by making those objects more useful, both within closed networks and in interfacing with the wider world. Essentially, Thingful connects platforms of objects to each other, rather than objects themselves.
If all the connected 'things' in the 'internet of things' are to 'talk', they need to be able to find each other, negotiate for access, unlock the right channels, translate between dozens of different formats, protocols & authentication methods, semantically understand each other, and ultimately conduct some kind of exchange (possibly financial) – across and between all these thousands of platforms. And this is precisely what Thingful makes easy, straightforward and secure.
Thingful today – a search engine for the internet of things, bringing interoperability to connected objects around the world
Which brings us to today. Since Thingful deals with connected objects that already have platforms, binding together those platforms across the data spectrum, it’s really a set of services, any of which you can choose to use.
That means Thingful is more than just a platform: we don’t expect people (companies, organisations, cities) to build their entire business on it and get permanently and riskily locked-in. We expect them to use it to amplify and add value to their existing business. There are likely to be many competing interoperability search and protocol mapping services in the end, and people will use whichever one seems most convenient at the time. That will keep us on our toes! Of course, we hope to benefit the more of these there are.
That’s why we have architecturally separated out, very explicitly, the search functions in the API and the access functions (much like Google and Chrome are two different things). You can use Thingful’s Search API to find what you need, and then use its results to connect to the data resources directly without further use of Thingful. Or you can use the Access API for the convenience offered by the same parser-mapping functions (including protocol and format translation) that power Thingful’s indexing, to retrieve unified and somewhat normalised data from both inside and outside your own network. You can use either interchangeably.
Once your network is indexed with Thingful, you can search, organise, access and respond to all the data within it. Eventually, and only if you're ready, you can then unlock it to control how and whether others can do the same with any part of that data. Every part is optional, and you’ll only use the parts that are explicitly valuable to you.
Some of the initiatives where Thingful’s ‘interoperability glue’ is being applied right now around the world include:
In the UK, our Connected Car Infrastructure project and several initiatives with cities (more to come later)
In the EU, part of GROW Observatory, a citizens' observatory for family farmers, gardeners and growers, funded by Horizon 2020
And in Singapore as part of the IDA’s Internet of School Things initiative
Whether you call our journey a series of small pivots, or just iterations in a smoothly evolving process, the point is that we use everything we do to learn more. So expect Thingful to continue to evolve as the market matures!
And so, with that, and with big thanks to everyone that has helped shape this conversation, Thingful, the website, has finally been updated to reflect what we've actually been doing all these months! Only small changes to the site right now, but bigger things will come over the next few months. 
Stay tuned, and do get in touch if you’re interested to know more.
thingful · 9 years ago
Virtual sensors: using Thingful and data science to fill in ‘missing’ data
Thingful indexes millions of public and private connected objects around the world, all together generating vast quantities of data from across dozens of domains. And yet there are still huge gaps. Try tapping into the ‘pulse of the planet’ in areas where sensors are sparse and you’ll come up empty-handed. If you need temperature data at a particular location on the planet, in the absence of actually installing your own sensor right there, your chances of finding an existing temperature sensor that you are entitled to access are pretty small. But there may be other data available nearby that could help infer what a temperature sensor might read, if there were one available.
So this summer we have been conducting data science and machine learning experiments to see how Thingful might ‘fill in the gaps’ of ‘missing’ data to create ‘virtual sensors’, by drawing on its vast index of multi-domain data. 
Data scientist Usamah Khan, who has been working with us this summer, tells us more about the results below.
Experiment to predict ‘missing’ data
Thingful indexes dozens of IoT data repositories all over the world, spanning environment, traffic, health and technology sensors. All these objects are connected and report geo-location and time-series data. The sensors all collect information on their own networks to give a glimpse into what’s happening in the world around us – but only if we look at what each one wants to tell us individually. What can the things in our environment tell us all together?
Suppose we want a glimpse of temperature in real time. Take the area of a city, divide it up into a grid of small segments and find the temperature in each segment. To do this we’d need thousands of sensors, normalised and of consistent accuracy. At this point in time, those resources just don’t exist. However, we have other data: a lot more “things” connected that surely relate to one another. With this in mind, can we estimate, with a reasonable degree of confidence, the temperature at every location through a combination of the following calculations?
Interpolation between the sensors we have (sketched after this list)
Correlation calculations for non-temperature sensors with similar sensor ranges that correlate with an X-Y range of temperature, e.g. air quality monitors, traffic sensors, wind, pressure, etc.
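As a sketch of the first calculation (the post doesn't pin down the exact method used in the experiment; inverse-distance weighting is the textbook choice for this kind of spatial estimate):

package interpolate

import "math"

// Reading is a known sensor value at a location.
type Reading struct {
	Lat, Lon, Value float64
}

// idw estimates the value at (lat, lon) as a weighted mean of known
// readings, where closer sensors count for more (inverse-distance
// weighting with a squared-distance falloff).
func idw(readings []Reading, lat, lon float64) float64 {
	var num, den float64
	for _, r := range readings {
		d2 := (r.Lat-lat)*(r.Lat-lat) + (r.Lon-lon)*(r.Lon-lon)
		if d2 == 0 {
			return r.Value // we are exactly on a sensor
		}
		num += r.Value / d2
		den += 1 / d2
	}
	if den == 0 {
		return math.NaN() // no readings at all
	}
	return num / den
}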
This was the purpose of a project that took place at Thingful during July. With a hypothesis in hand, we had to decide on goals for the experiment and ask what we would consider a satisfactory result:
Prove that we can infer and impute information we don’t actually have in this framework
Prove that a model can work by creating and testing it on our known data
We chose London for our analysis because this was the area with data most easily available to us. Since the data we’re trying to predict (temperature) is time-series, it made sense to pull data from the same time window.
Since we were pulling a lot of data, we first needed to see how it was spread around London.
There was a huge spread, and it was not entirely centred. To get a better idea of the longitudes and latitudes we were dealing with, we looked at the points on a Cartesian plane.
Inspecting it, we found a large concentration of sensors in central London and adjusted our limits accordingly.
We began by building a grid and defining the precision we wanted our model to achieve. We had two options: a finer grid, giving a more precise picture of temperature, or a coarser grid, giving more data in each segment.
After building the grid, we associated each sensor with a segment using a clustering algorithm. This way, each sensor was correctly associated with a segment and we could begin finding correlations (a simplified sketch of this step follows).
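A simplified sketch of that grid-assignment step (the bounding box and resolution here are illustrative, and the experiment proper used a clustering algorithm rather than the plain binning shown):

package grid

// Illustrative central-London bounding box and a 20x20 grid; the real
// experiment chose its limits after inspecting the sensor spread.
const (
	minLat, maxLat = 51.45, 51.55
	minLon, maxLon = -0.20, 0.00
	rows, cols     = 20, 20
)

// cellFor bins a sensor's coordinates into a grid segment.
func cellFor(lat, lon float64) (row, col int) {
	row = int((lat - minLat) / (maxLat - minLat) * rows)
	col = int((lon - minLon) / (maxLon - minLon) * cols)
	if row < 0 {
		row = 0
	} else if row >= rows {
		row = rows - 1
	}
	if col < 0 {
		col = 0
	} else if col >= cols {
		col = cols - 1
	}
	return row, col
}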
We then looked at the data more widely to understand the spread of variables. Plotting a heat map of temperature gave us an idea of where data was missing. As it turned out, at this resolution the spread wasn’t quite what we hoped for – for reasons we discovered later.
The next step was to build a system to predict temperature. We found machine learning with random forests worked well. Random forests are an extension of the decision tree algorithm: where a single decision tree classifies by branching until a classification is reached, a random forest trains many trees, each from a different random sample of the data, and aggregates their votes – a virtual “forest” that usually gives a more accurate, less overfit result. Though random forests typically predict best for classifications or discrete outputs, our temperature did not vary greatly and was recorded in integers, giving us a range of 5 buckets from 16-21°C as our output, so random forests could be used effectively.
The result gave us an accuracy of 71% when we compared our predictions on the training set with the actual measured results. Not quite the result we were hoping for, but adequate for a first prototype.
This essentially means that, using the model we developed for this experiment, we can use nearby air quality, traffic, wind, pressure and other environmental data that Thingful indexes, to predict with 71% accuracy what the temperature will be at a given location.
The biggest issue for us was a lack of data, both in quantity and in variability. We determined that pulling more data from a wider breadth of categories, for example including transportation and more environmental data, could help with the model.
The final step in the process was to build a system to predict the temperature in areas where we don’t have that information. Since most of the data was pulled from the same sensors, we found that areas with no temperature data were also areas where little other data exists. Where there is no data, there’s no correlation, and hence no information to base a prediction on. So, at this point, we couldn’t finish this step. But it told us a lot about what we were trying to achieve and how we were going about it.
This was just the starting phase: an experiment with the simple goal of asking “Can this be done?” – something that couldn’t even be attempted without Thingful’s framework. After more experimentation, research and development, Thingful might be used to build such a tool on a global scale. The question we’re all interested in is how this will change our context and interactions with our environment.
Stay tuned for further developments – and if you would like to know more about this work and where we are taking it, please get in touch. 
thingful · 9 years ago
IoT innovation - with thanks to our funders and supporters!
When you’re developing and deploying fundamental technology for a nascent market like the internet of things, it’s a risky proposition for most funders, and few are willing to join the ride early on. We know this very well from experience! So over the last couple of years, we’ve been very grateful for the funding and support that we’ve received from a variety of sources.
Back in 2014, Thingful was showcased at the Digital Catapult in London, which gave us early exposure to other startups and technology companies.
The following year saw Thingful join the Open Data Institute’s startup programme, which was tremendously valuable in introducing us into a wider world of data and led to two great opportunities. The first was to be part of the ODI Summit in November 2015, where we encountered dozens of kindred spirits and made valuable connections that have led to concrete projects. 
The second was that we applied for, and received, funding from ODINE (the Open Data Incubator for Europe), which enabled Thingful to dramatically expand its index of open data resources from around the world, and to implement the ODI's data spectrum classification across all resources. One of the best things about ODINE's support was getting to work with our mentor, Mario Lenz, Vice President Smart Service & Big Data at Empolis. Read more in Thingful's Bold Vision for an Interoperable Internet of Things.
Along the way, as we blogged about recently, we’ve also been boosted by Innovate UK funding for our connected vehicle infrastructure project and our work to discover and access cross-domain urban IoT data.
Finally, summer 2016 has seen us working with a large consortium of partners coordinated by the University of Dundee to plan an EU-funded Horizon 2020 Citizen Observatory project. This is an exciting one, because the project will last several years and we'll be working closely with some of the leading lights of the citizen-focused IoT world.
We have a couple more happy funding announcements to make, but they’ll have to wait for full disclosure later this year!
thingful ¡ 9 years ago
Text
Showcase: Connected Vehicles – leveraging IoT data assets by making them securely available to trusted third parties
We were delighted to be one of the few selected companies to receive funding to help build infrastructure for connected and autonomous vehicles through the CAV (Connected and Autonomous Vehicles) competition, part of the UK government’s £100 million Intelligent Mobility Fund.
Connected vehicles are those that connect to a network, generating and consuming data, and communicating with local and remote services in realtime. Since 2001, Europe has required all petrol vehicles to have an on-board diagnostics (OBD) port, which has given rise to a wide variety of tethered solutions for connecting car data to the cloud for the purposes of tracking, monitoring and analysing vehicular data. Most recent cars have an embedded module and SIM card for streaming even more detailed data directly to a cloud service, and for interacting with services provided either by the car manufacturer or by other third parties.
Typical use cases for connected vehicular data include telematics-based insurance, remote diagnostics for maintenance, real-time feedback on driving behaviour and vehicle lifecycle management. But once a vehicle is securely connected to the internet dozens of other data-oriented use cases open up.
A connected car will need to access real-time data from others, including the built environment and even other cars. Nearby air quality data can be used to determine internal air conditioning settings to improve the passenger experience; environmental data can help in risk mitigation; home automation systems can prepare for arrival. Thingful obviously performs well in this context, by providing on-the-fly geolocation-based searches for nearby IoT data, as described in a previous post ‘Finding & accessing cross-domain urban IoT data’. 
But less obvious is the fact that real-time data generated by a car is also valuable in other domains. Wheel sensors, light levels, rain sensors, wiper usage, headlights, proximity: all of these can be used by third-party services for a wide range of purposes. For example, a weather prediction service could use realtime foglight usage data to ground-truth and improve its forecasts. Road maintenance services could couple engine data with aggregated accelerometer data to determine road wear and schedule maintenance programmes more effectively. Data services accessing real-time vehicular data might even provide compensation (monetary or otherwise) for making use of it. But how would such data services ‘discover’ such private data, let alone access it, especially given the requirement to have the owner's consent?
In a previous blogpost (Finding & accessing cross-domain urban IoT data) we discussed the complexity of acquiring real-time data from multiple stakeholders, but the other major challenge that Thingful addresses is providing a system that enables data owners to manage and fine-tune discovery and access entitlements to their data, so that they can be discovered and accessed only by those they have explicitly approved. 
For example, one driver may allow only academic and non-profit third parties to ‘discover’ their vehicle, where academics get all of their vehicle data but none of their user profile, while specific non-profits get everything. Another may allow emergency services to discover and access all of their data, but commercial entities can only discover that they exist and have to request specific access to their geolocation. The discovery and access entitlements matrix can get extraordinarily complex, especially at scale.
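To make the idea concrete, here is a toy encoding of that entitlement matrix in Python, covering the two drivers above. The category and field names are hypothetical, not Thingful's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Entitlement:
    discover: bool = False                          # may this client category find the car?
    fields: set[str] = field(default_factory=set)   # data they may then access

# First driver: academics see vehicle data but no profile; non-profits see all;
# commercial clients cannot even discover the car.
driver_one = {
    "academic":   Entitlement(discover=True, fields={"vehicle_data"}),
    "non_profit": Entitlement(discover=True, fields={"vehicle_data", "user_profile"}),
    "commercial": Entitlement(),
}

# Second driver: emergency services get everything; commercial clients may see
# the car exists but must request geolocation access explicitly.
driver_two = {
    "emergency":  Entitlement(discover=True,
                              fields={"vehicle_data", "user_profile", "geolocation"}),
    "commercial": Entitlement(discover=True, fields=set()),
}

def visible_fields(policy: dict[str, Entitlement], client_category: str) -> set[str]:
    """Resolve what a given client category may access, if anything."""
    ent = policy.get(client_category, Entitlement())
    return ent.fields if ent.discover else set()

print(visible_fields(driver_one, "academic"))    # {'vehicle_data'}
print(visible_fields(driver_two, "commercial"))  # set() (discoverable, no data yet)
```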
For our CAV project, a fully working technical feasibility study, we are deploying a system to assess real-time vehicle data sharing within a decentralised system of data producers (cars) and consumers (remote services). We demonstrate a system for making vehicle data (a) discoverable and (b) accessible to a variety of third parties via a transaction management system founded on drivers' explicit consent and incentivisation.
A car owner can use the system we are developing first to select what categories of data service user can ‘discover’ their car data (commercial, academic, general public, etc.) and second to specify precisely which data they entitle them to access on demand (raw data, user profile, summary data only, precise geolocation, sensitive data, etc.). Meanwhile, data clients/consumers can conduct API searches to find those vehicles they are entitled to find in order to send requests for access to specific data they require.
Technically, it is a phenomenally complex problem: every relationship (car to data client) has individualised entitlement policies, and so handling both searches and real-time access has to be enabled in a secure but scalable and useable way, i.e. one that keeps car data owners (either drivers themselves, or fleet managers) in control of where and how their data is found and used.
The potential of connectivity and the increasing level of sensor-based automation in vehicles isn't currently being harnessed, owing to the lack of a clear value and business proposition for automotive OEMs. But by developing this system we aim to demonstrate that a data service for automotive aftermarket channels, app developers and other parties (those affected by, or with a direct interest in, the automotive value chain) can accelerate value creation in the connected vehicle industry, and to show how a technology like Thingful enables and supports it.
Though this project, due to be completed in Q1 2017, is aimed at the Connected Vehicle industry, the framework we are deploying for fine-grained control of access entitlements to IoT data clearly applies to all connected objects, owned by both individuals and companies. Thingful is a search engine for the Internet of Things, enabling secure discoverability & interoperability between millions of public & private connected objects around the world – that means finding connected objects, and it also means enabling owners to control when and whether they can be found.
Thingful can help you leverage your data assets by making them securely available to trusted third parties outside your network while controlling, restricting and fine-tuning discoverability and access entitlements by others to all aspects of your data and metadata. So please contact us if you are interested to learn more about unlocking the true value of your IoT network or connected objects. And if you operate in the automotive industry we’d particularly love to speak to you about this project, so do get in touch!
thingful ¡ 9 years ago
Text
Automatic Resource Discovery for the Internet of Things
As a search engine for the internet of things, we are often asked how we discover new IoT networks to add to the growing index. The short answer is that this is still a semi-manual process. We have an automated and generalised scalable crawling and indexing engine, but it still requires some manual configuration when we encounter new data formats or protocols. There are, however, emerging standards for the way that IoT networks catalogue their resources, and Hypercat is one that we have been involved in since the very beginning, more than five years ago.
More recently, we were also on the steering committee for the British Standards Institute’s ‘PAS 212, Automatic resource discovery for the Internet of Things – Specification’, to provide technical expertise and experience in the industry.
PAS 212 gives IoT software engineers a simple, lightweight and standardised method whereby publishers can advertise their resources and subscribers can automatically discover and “understand” them. It allows any IoT-capable device to interoperate easily with other devices without the need for intervention from a human programmer.
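To give a flavour of the format, here is a minimal catalogue modelled on the published Hypercat examples, expressed as a Python structure. The href and descriptions are invented; treat the details as indicative rather than normative:

```python
import json

# A minimal Hypercat (PAS 212) catalogue, sketched from the published spec.
catalogue = {
    "catalogue-metadata": [
        {"rel": "urn:X-hypercat:rels:isContentType",
         "val": "application/vnd.hypercat.catalogue+json"},
        {"rel": "urn:X-hypercat:rels:hasDescription:en",
         "val": "Example sensor catalogue"},
    ],
    "items": [
        {
            "href": "https://example.com/sensors/temp-001",  # hypothetical resource
            "item-metadata": [
                {"rel": "urn:X-hypercat:rels:hasDescription:en",
                 "val": "Rooftop temperature sensor"},
                {"rel": "urn:X-hypercat:rels:isContentType",
                 "val": "application/json"},
            ],
        }
    ],
}

# A subscriber fetches this JSON from a well-known URL (conventionally /cat)
# and walks the rel/val pairs to decide which resources it understands.
print(json.dumps(catalogue, indent=2))
```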
In order to tap into the ‘pulse of the planet’ and interoperate, things must be able to find each other. The PAS 212 spec (commonly referred to as Hypercat) is one way to make that easier. Millions of first generation IoT devices and services are already connected to the Internet, but not connected to each other – though the inflection point is coming. There are data silos everywhere, probably even within your own organisation, but by operating across those silos and unlocking the true value of IoT data, Thingful helps you accelerate into the fast lane of next generation IoT. 
If you have a network of IoT devices, implementing the Hypercat spec is one way to make that process much easier. Get in touch if you need a headstart – we’d love to help.
thingful ¡ 9 years ago
Text
Showcase: ‘London/Cambridge Cycling’ – finding & accessing cross-domain urban IoT data
The promise of the Internet of Things is that every ‘thing’ can talk to everything else. Yet the majority of IoT data, today and in future, will reside in private domains. So one of the biggest questions is how to unlock the value of IoT data by making it useable across domains, when so much of it is behind garden walls. The value of the first wave of IoT products and services that we have seen over the past few years has been connecting physical devices to a network (for monitoring, customer service, analytics etc.). But the second wave is all about enabling these products and services to interoperate, and capitalising on network effects to generate value by building upon and using each others’ devices, sensors and data. In the real world, this is a messy problem.
[Image: IoT walled gardens]
In order to be useful and valuable to others, such data has to be discoverable, searchable and actionable in realtime. Some IoT platforms and services aim to achieve this by centralising all the data in a single data repository and converting it into a single data format. Given the expected scale of the global internet of things, the diversity of players, and domain-specific data requirements, we don’t believe the centralising approach espoused by big IoT platform players is scalable, either technically, financially or philosophically.
For many reasons it makes sense that IoT data owners keep their data behind garden walls. They should be empowered to control the discovery, release and usage of their data by others, while still being protected within their own data infrastructures behind their own firewalls. In selectively inviting others to make use of it (or even selling it), they must be able to control who is ‘entitled’ to their data, and in what format; and that data may be anonymised, aggregated, delayed, processed, sliced, or even delivered raw in realtime to third party data clients. More than mere permissions control, IoT data owners (whether individuals or commercial entities) need to be able to fine-tune these entitlements, to determine exactly who, how, where and when their data is used, and these entitlements will be completely different for different data consumers at different times. Read more about this in my article Managing Privacy in the Internet of Things, in Harvard Business Review, Feb 2015.
Nowhere is this more evident than in urban IoT data, a quagmire of messy data, messy ownership and messy data consumers. The most forward-thinking cities that we work with have begun to realise the benefits of distributed IoT data systems, not least the fact that they won't be locked into cumbersome and expensive 20-year contracts with Big Business data platforms that are technologically obsolete in a matter of months. Building a technical infrastructure for making distributed urban IoT data usefully accessible in realtime by third parties in a trusted environment is tough, especially if you want to keep data owners in full control, which crucially means leaving data where it's created and owned, and not centralising the data in a single data hub or repository.
In 2015, we partnered with kindred IoT spirits 1248.io (now rebranded DevicePilot, headed up by one of the industry’s well-known luminaries, Pilgrim Beart) to try to work on these issues. What emerged was a project, funded by Innovate UK’s IoT TechCity UK Launchpad, to help citizens decide whether or not to cycle, by drawing on a host of different real-time IoT data resources (weather, air quality, traffic, transportation, even bike share availability) along their planned route. The aim was to use realtime data in a meaningful and useful way that helped cyclists, focused on London and Cambridge.
[Image: What happens when cyclists in Cambridge avoid the rain and use their cars instead]
While 1248.io handled the data analytics and recommendation algorithm, Thingful was the discovery and access point for actually acquiring the data needed to make the decision. This data was acquired in real-time, using our discovery and access engine, from several different originating sources, both public and private, open and closed. The problem the project solved was to build a technical framework that enables data clients to find and access private data, in many different formats and from various sources, that they are entitled to discover and access, in such a way that the negotiation of these value transactions can be automated. In other words, the problem was to match data clients with data producers on the fly, in real-time, without having to code for dozens of different APIs and protocols.
What emerged after a year of work, in March 2016, was a TRL6 proof-of-concept mobile app that we call London/Cambridge Cycling. The purpose of this mobile app is to help people in London and Cambridge decide, every morning, whether or not to cycle to work. In order to determine this, the algorithm needs access to air quality, weather, traffic, transportation and even bike-share data, all available from different sources with different privacy levels, using different formats and protocols, which is where the real difficulty lies. To make it even more complex, different routes or different parts of the same route require data from completely different data providers in each domain, some of them open and some of them closed.
The mobile app’s recommendation algorithm uses all these data resources combined in real-time to assess how enjoyable it is to cycle at any particular time.
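Purely as an illustration of the kind of weighting involved (the actual recommendation algorithm is 1248.io's; the thresholds and weights below are invented), a score like this might combine the readings as follows:

```python
def cycling_score(rain_mm: float, wind_kmh: float, no2_ugm3: float,
                  bikes_available: int) -> float:
    """Combine real-time readings into a 0-1 'how nice is it to cycle' score."""
    score = 1.0
    score -= min(rain_mm / 10.0, 0.5)     # heavy rain can halve the score
    score -= min(wind_kmh / 100.0, 0.3)   # strong wind penalty
    score -= min(no2_ugm3 / 400.0, 0.2)   # poor air quality penalty
    if bikes_available == 0:
        score = 0.0                       # no bike-share bike, no ride
    return max(score, 0.0)

print(cycling_score(rain_mm=0.2, wind_kmh=15, no2_ugm3=40, bikes_available=7))
```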
[Image: London/Cambridge Cycling – main interface]
In order to facilitate finding and accessing such data, Thingful provides a distributed discovery and data parsing framework, that indexes and accesses millions of data resources around the world, to go out and retrieve the required data in real-time. The mobile app can therefore search, on-the-fly, for available data resources at particular geolocations, and then use Thingful’s data access API to retrieve real-time data directly from the original IoT data source, infrastructure or repository. 
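Illustratively, a data client's round trip might look like the sketch below. Note that the endpoint paths, parameters and response fields here are assumptions made for the sake of the example, not Thingful's documented API:

```python
import requests

API_BASE = "https://api.thingful.net"   # assumed base URL, for illustration only

# Search for nearby air quality resources around a point on the planned route.
params = {
    "lat": 51.5074,
    "lon": -0.1278,
    "radius": 2000,          # metres
    "category": "air_quality",
}
search = requests.get(f"{API_BASE}/search", params=params, timeout=10)
search.raise_for_status()

# Each result is assumed to carry a URL for retrieving its live readings
# directly from the original source via the data access API.
for thing in search.json().get("things", []):
    data = requests.get(thing["data_url"], timeout=10).json()
    print(thing.get("title"), data)
```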
[Image: London/Cambridge Cycling – processing interstitial]
Rather than centralising the data in a single datastore under a single authority, this approach enables data owners to retain full control over who has access to their data, and under what conditions; and data clients are able to conduct fine-grained searches for the type of data they require, across many domains, going straight to the source to access it.
[Image: London/Cambridge Cycling – response]
If you would like to know more about the app and see a demo, please get in touch, or have a look at this simplified open source web app that demonstrates the basic principles of using the API in this context.
By showcasing a real world solution to a messy problem, ‘London/Cambridge Cycling’ demonstrates how Thingful enables various data partners, open and closed, public and private, to cooperate without needing to trust any single data hub or adopt any single standard format.
Whether you deal in public or private data, open or closed networks, fragmented or tightly defined IoT infrastructures, Thingful enables you to search, organise, access and respond to real-time data both inside and outside your existing IoT network, without the scalability issues that arise from centralising data in a single repository. If you’re an IoT data owner, drop us a line to find out how Thingful helps you unlock your data’s value by controlling how it is found and used by others outside your network. And if you are building apps or analytics and want to access IoT data around the world, email us to get on the API private beta.  