The day has come – we’re finalizing our data pipeline to return data to you, our citizen scientists! It’s been a twisty road, and we’re still tweaking, but we’ve begun to build some usable products for your delectation and exploration!
We want to know more from **you** about what you want and what is interesting for you to explore. So today I’m going to post some demo data for you to look at and give us feedback and comments on. This is a data file from our California project that consists of polygons for each kelp forest at different levels of user agreement on whether pixels are kelp or not. First, here’s the file in three formats, depending on what you want (we can also add more if asked for).
You can do a lot with these in whatever GIS software you prefer, and if anyone has examples, we’d love to post them! For now, here’s a quick and dirty visualization of the whole shebang at the threshold of 6 users agreeing on a pixel (source).
Neat, huh? You can even see where something in one image was confusing (no kelp on land!) which now I’m *very* curious about.
So, what’s in this dataset? There’s a lot, but here are the things most relevant to you:
threshold – the number of users who agreed that the pixels in a given polygon are kelp
zooniverse_id – the subject (i.e., tile) id of a given image; if you want to look at just a single image, subset to that id
scene – Individual Landsat “images” are called scenes. So, every subject that we served to users was carved out of a scene. You can look at a whole scene by subsetting on this column. For more about what a scene name means, see here
classification_count – number of users who looked at a given subject
image_url – to pull up the subject as seen on Floating Forests
scene_timestamp – when was an image taken by the satellite?
activated_at – when did we post this to Floating Forests?
There’s a lot of other info regarding subject corner geospatial locations. We might or might not trim this out in future versions, although for now it helps us locate missing data and see what has actually been sampled.
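To show how the columns above fit together, here’s a minimal sketch of subsetting the data in Python. The rows below are made up for illustration (the ids and scene name are not real), but the column names match the list above; with the actual file you’d load it into a table first (e.g. with geopandas for the polygon formats).

```python
import pandas as pd

# Toy rows mimicking the columns described above (all values are made up)
df = pd.DataFrame({
    "threshold": [4, 6, 6, 8],
    "zooniverse_id": ["AKP0000abc", "AKP0000abc", "AKP0000def", "AKP0000def"],
    "scene": ["LT50420362010123PAC01"] * 4,
    "classification_count": [12, 12, 15, 15],
})

# Keep only polygons where at least 6 users agreed the pixels were kelp
kelp_6 = df[df["threshold"] >= 6]

# Subset to a single subject (tile) by its Zooniverse id
one_tile = df[df["zooniverse_id"] == "AKP0000abc"]
```

The same pattern works for subsetting on `scene` to pull out a whole Landsat scene at once.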
So, take a gander, enjoy, and if you have any comments, fire them off to us! This is just a sample, and there’s more to come!
Thanks to all of our great citizen scientists! I loved this Tweet from Trine Bekkby and the Norwegian Blue Forests Network so much that I thought I’d post it. Look at that Laminaria hyperborea! SO GORGEOUS!
From our kelp to your kelp, happy holidays!
As I’ve been browsing through these beautiful images of classifications in the Falklands, I realized something. One of the reasons to explore the Falklands is that there aren’t too many studies looking at more long-term kelp dynamics there. Now, I’m a Northern Hemisphere kelp forest ecologist. We know that typically many types of kelp forests start to boom in the spring, get to peak biomass in the late summer/early fall, and then get whacked back by fall/winter storms before booming again in the spring.
One of the first questions I have as a scientist, then, is: do we see the same seasonal trends in the Falklands? I’m very curious what y’all are seeing, so I started a thread on talk asking y’all to note any observations. Please also tag very kelpy images with the month they were taken (click the (i) for information) as well as the #sokelpy hashtag, so I can do a quick search by hashtag to see the frequency of when #sokelpy occurred. I’ll post the resulting data after we get a decent set of tagged images.
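Once the tags are in, tallying them up is simple. Here’s a sketch of the kind of count I’ll do, with hypothetical (tag, month) records standing in for whatever comes out of the Talk search; the months come from the image metadata (the (i) button), not from when the tag was posted.

```python
from collections import Counter

# Hypothetical records pulled from a Talk hashtag search (made-up data)
tags = [
    ("#sokelpy", "January"),
    ("#sokelpy", "January"),
    ("#sokelpy", "March"),
    ("#sokelpy", "September"),
]

# Frequency of #sokelpy sightings per month
by_month = Counter(month for tag, month in tags if tag == "#sokelpy")
```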
And talk about what you’re seeing – month by month, or if you’re noticing certain years have more or less kelp over in the thread!
(And, heck, we haven’t even talked about north v. south side of the islands – but that’s for another time!)
If you are reading this post, it means the Floating Forests relaunch is live – thanks to all of your hard work we were able to get through over 20 years’ worth of data! Special thanks to the beta testers who gave us tons of feedback on the new site. We are busy on our end calibrating the results from the first round of data, and it’s looking great. I don’t want to spill the beans on a future blog post, but working with this dataset has already led us down a new path with some unexpected collaborators!
As exciting as calibration models are, today’s main event is even better! Welcome to Floating Forests 2.0! We have been hard at work with Zooniverse to make your experience even better. In addition to a shiny new website, we’ll be taking you to a new part of the world – The Falkland Islands!
The Falkland Islands are an often overlooked ecological treasure. From land they appear a windswept grassland dominated by birds and insects, one of only a handful of places on Earth with no native trees. The coastal waters, however, are a different story altogether. You’ve probably guessed where this is going – kelp! Lots of kelp! The expansive kelp forests ringing the islands more than make up for the lack of terrestrial trees. Kelp forests around the world are a haven for wildlife, and these are no different. They are an irreplaceable resource for elephant seals, fur seals, sea lions, multiple penguin species, two types of dolphins, and a huge number of fish and invertebrates. A recent report1 has listed lack of awareness and information as one of the biggest threats to the Falkland Islands’ marine biodiversity, so let’s generate data and get aware!
Before you dive in, let’s take a quick tour of the new website – if you’re familiar with our old site you’ll already know the drill, but some things have been moved around!
As you can see, there are two buttons at the bottom. “Classify Kelp” brings you to our shiny new version of the kelp tracing you all know and love. “Kelp presence/absence” brings you to a new feature: a simplified, mobile-friendly task that can be done quickly and easily! This allows anyone who wants to check out the project to do so even if they don’t have access to a full computer. On the research side of things, it allows us to squeeze every last drop of data out of these satellite images. To make a long story short, images from different satellites are different, and these differences make it somewhat difficult to automate a filter that boots out bad images. Just like with kelp classifications, our brains are much more useful here than computers. Once enough people have tagged an image as “kelp”, into the main workflow it goes to be classified!
Across the top, you will see a number of headings.
About: Learn about kelp, the project, and the team behind the research!
Classify: Get right to the action and start classifying kelp.
Talk: This links to our talk forum where you can discuss particular images, ask science questions, get technical help, and more! We will be very active here, so don’t hesitate to post!
Collect: More on this later, but this is where collections of images are found.
Recents: Link to your most recent classifications.
Blog: Direct link to the blog you are currently reading.
The classification should feel pretty familiar. The field guide tab on the far right has been overhauled and contains many examples of phenomena you could find in these images – refer to it often! It is constantly being updated, and if you have suggestions for additions, let us know in talk!
Beneath the image are three buttons.
From the left:
Metadata: Click this to view metadata (location, time/date, satellite number), as well as a link to the image on Google Maps.
Favorite: Click this to add the image to your favorites, allowing you to quickly find it again.
Collect: Similar to adding an image to your favorites, you can add an image to a collection. This way we can collaboratively sort through images, keeping track of those that contain loads of kelp, cities, or any other identifiable feature.
Once you complete an image and have clicked the green “Done” button, you will see the following information:
Here you will see a summary of the number of patches you marked, as well as the blue “Talk” button. If you had any questions about the image, this button will create a discussion thread linked back to the image. Use this space to ask the science team any questions you might have about the image. Don’t be shy, we love to talk!
In addition to these front-end changes, there have been some under-the-hood updates as well that make it much easier for us to add images or collections and even create new workflows – stay tuned for future happenings with these features, but for now go check out the new site!
- Otley H. Falkland Islands Species Action Plan for Cetaceans 2008-2018.; 2008.
Hello! You may have noticed that things have slowed down on the website. To make a long story short, thanks to all of your help we are down to the last handful of images from California and Tasmania! We have been busy cleaning the data up and getting it ready to go. This milestone has given us a chance to reflect on the first phase of the project and to get ready for some exciting next steps – more on this later!
In January, science team members Jarrett Byrnes, Kyle Cavanaugh, and Isaac Rosenthal traveled to Chicago to meet with the Zooniverse team. We were hosted at the amazing Adler Planetarium, and had an unbelievable week of planning and collaboration (and eating!). By getting the science and development teams into the same room, we were able to work through a few issues that have been nagging the project since its inception, fixing some geo-referencing issues and streamlining the post-processing of the data (in other words, what happens to the data after the kelp is classified). It was truly amazing to spend a week surrounded by talent from so many disciplines, ranging from educators to back-end web developers. I think I speak for all of us when I say that it was a unique and deeply inspiring experience!
While still under construction, hopefully most of this is a familiar sight. Our goal with this relaunch is to make YOUR jobs easier! The tracing tool has been upgraded, and we will be able to spruce up the field guide. The under-the-hood flexibility of the new system is incredible and leaves the future of Floating Forests wide open. Custom datasets and modular workflows mean that the sky is the limit! Something that I am personally excited about is the opportunity to use these tools to ask new questions, broadening horizons for research and education. This relaunch will also feature an overhauled talk section so that we can continue to communicate with all of you!
Stay tuned for more information as we begin beta testing of the new website!
Last week I had the opportunity to take part in a citizen science forum organized by the White House. It was inspiring to see how committed the White House is to harnessing the power of citizen science. A number of exciting announcements were made during the event. For one, the Federal Citizen Science and Crowdsourcing Toolkit was officially released. This toolkit, developed with the support and collaboration of over 25 federal agencies, provides step-by-step instructions, case studies, and other resources to help scientists use citizen science in their research. As you might imagine, Zooniverse projects are well represented in the successful case studies section! Then John Holdren, the Director of the Office of Science and Technology Policy, gave a talk where he announced the release of a memorandum promoting the use of citizen science by Federal Agencies. Towards the end of the forum Senator Chris Coons (D-DE) announced a new bill authorizing citizen science and crowdsourcing. This bill is co-sponsored by Senator Steve Daines (R-MT), making it bi-partisan! During his talk Senator Coons described how he and his family were citizen scientists themselves and have spent many evenings collecting data for a wide variety of different Zooniverse projects! So next time you are chatting with someone on Talk, know that he or she could very well be a senator or representative. Perhaps even President Obama has a Zooniverse account?
In between these exciting announcements there were panels on Community Science Leaders, Oceans and Coasts, Democratized Tools, Water and Agriculture, and Communities and Health. A number of really exciting citizen science projects were highlighted during these panels. These ranged from investigations of the impact of aggressive policing to surfboards that collect oceanographic data to the development of methods for utilizing indigenous traditional knowledge to our own Floating Forests! You can watch the entire forum here.
I had the honor to serve on the Oceans and Coasts panel with some HUGE names in the marine science world: Dr. Alex Dehgan, Dr. Sylvia Earle (aka Her Deepness), Dr. Daniel Pauly, and Dr. Janet Coffey. During the panel we talked about the importance of the ocean and how little we know about it. The oceans play a central role in supporting human life. Yet we’ve mapped less of the ocean floor at high resolution than the surface of Mars, Venus, and the Moon combined. We have limited information about the changes that coastal ecosystems like coral reefs, mangroves, and giant kelp have been experiencing in recent decades. Citizen science provides a powerful method for collecting data that will allow us to better understand and protect these critical ecosystems.
Well, we’ve finally hit a critical mass of classifications (well, blown past it) and other projects by science team members have boiled down (we’ll be posting about them – they’re kelpy!), so we’ve begun to dig into the data. For anyone who wants to follow along at home, all code that we talk about will be posted in this github repository.
I thought I’d begin by telling you all about how *you* have been interacting with Floating Forests. Namely, how much effort do the ~5,100 users of FF put into the project?
Many Zooniverse projects do well from a lot of people doing just a few images each. We’re no different. We have a nice distribution of folks, with many doing few images (~1,500 have done just one classification), but with a looong tail of users in the 100 to 1,000 range. See below, but note the log10 scale on the x-axis.
The average user, though, does ~125 classifications. If we put it together and look at the cumulative percentage of classifications done by users who classify different numbers of images, we see that ~25% is done by those users who classify fewer than ~250 images. So, our ‘super-users’ are incredibly important! Heck, we have one user who has contributed 5.15% of the classifications. The top 10 have contributed 18% of classifications.
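For anyone following along in the github repository, the cumulative-share calculation above boils down to a few lines. This sketch uses made-up per-user counts, not the real FF data:

```python
# Classifications per user, smallest contributors first (made-up numbers)
counts = sorted([1, 1, 3, 10, 50, 250, 900])

total = sum(counts)
cumulative = 0
shares = []  # running fraction of all classifications
for c in counts:
    cumulative += c
    shares.append(cumulative / total)

# The last few entries show how much the 'super-users' dominate:
# here the single top user alone accounts for counts[-1] / total of the work.
top_user_share = counts[-1] / total
```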
It may still be difficult to see just how much those users are doing in comparison to users classifying only a few images. So, we’ve done what many other Zooniverse projects have done and made a treemap!
It’s not only incredibly informative – with the size of each square being proportional to the contribution of an individual user – but, oh, pretty data! Enjoy!
Note: This post is from Briana Harder, our newest Science Team member! We encountered Briana in Talk where she not only noted some issues, but then wrote code to reprocess images to fix them! Needless to say, we were impressed. What emerged was a wonderful dialogue between Briana, members of the science team, and the folk at Zooniverse. She’s made some large changes to our image processing pipeline and helped us all learn a lot about how to use Landsat for kelp in places *other* than California. As such, we asked Briana if she wanted to take her involvement to the next level, and join the Science Team. And we were delighted when she accepted! So, here are her comments on the awesome work she did and how our image processing has changed.
The first thing to do upon finding an interesting problem is to find out if anyone else has solved it already. So I searched for research in the areas of image analysis, coastlines, and satellite imagery. The majority of the papers were far too detail-oriented to be very helpful; the problems in tracking the month-to-month changes of the coastline of a small island are wildly different from sorting coast from non-coast for FF! But I did find a fascinating paper on using Landsat data to build a highly accurate waterline database for all of Europe. They clearly solved the problem of finding ocean coastline, and then went a lot further!
The technique they used was to take a cloudless mosaic of the region (lots of preprocessing there!) and separate the image into three sets of pixels: water and land, selected with simple pixel-value thresholds, and unassigned pixels. They then ran a region-growing algorithm to add the unassigned pixels to either area.
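The three-way split is just two thresholds applied per pixel. Here’s a toy version of that first step (the threshold values and pixel array are made up, and the region-growing step is not implemented):

```python
import numpy as np

# Toy pixel values; WATER_MAX and LAND_MIN are made-up thresholds
pixels = np.array([[2, 5, 40], [60, 200, 180], [30, 45, 90]])
WATER_MAX, LAND_MIN = 25, 100

water = pixels <= WATER_MAX          # confidently water
land = pixels >= LAND_MIN            # confidently land
unassigned = ~(water | land)         # left for region growing to resolve
```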
This was a good find for me, because they’re solving a very similar problem, and I know how to implement both those things! Unfortunately, region growing is relatively slow and expensive, and it probably wouldn’t play nice with cloudy images. I did more digging over the next week, without finding anything else that was more promising. So I sat down and wrote a little program.
Simplicity is important when you’re working with a lot of data; if the running time of the algorithm is longer than a person would take to do the same task, something has gone horribly wrong! I went through a couple iterations on how to find water, but in the end, this is what I ended up with.
Water is any pixel where the red value is between 1 and 25. Water’s very dark in all the bands, but it’s darkest in red, so that’s the best way to find it. If we’re clever about it, we only need to read the pixel values once, and perform some simple math operations, which means it should hardly take longer than opening up the image to view it.
– Count all the pixels that are water.
– Count all the pixels that are black, value 0. This ensures it’s not biased to throw out images that are on the edges of the Landsat scene.
– Calculate the percentage of non-black pixels that are water.
– If that percentage is above a certain threshold, we’re good to go, keep this image. I picked 5% as the threshold, based on a little trial and error.
And that’s it! It by no means gets rid of ALL the non-coast images; for example, this does absolutely nothing for the abundance of partially cloudy ocean images. It also gets tripped up by dark shadows on land, either from clouds or mountains, as shadows are just dark enough to fall within that threshold. Lakes are also selected, if they’re big enough.
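The steps above can be sketched in a few lines of Python. This is my own minimal reconstruction for illustration, not the actual pipeline code; the function and parameter names are made up, and it assumes the red band comes in as a 2-D array of 8-bit values with 0 marking the scene edge.

```python
import numpy as np

def keep_image(red_band, water_lo=1, water_hi=25, min_water_frac=0.05):
    """Decide whether a tile has enough water to be worth classifying.

    red_band: 2-D array of red-channel pixel values, where 0 means
    edge-of-scene padding. Water is darkest in red, so values in
    [water_lo, water_hi] are counted as water.
    """
    red = np.asarray(red_band)
    water = np.count_nonzero((red >= water_lo) & (red <= water_hi))
    black = np.count_nonzero(red == 0)   # scene-edge pixels, excluded
    valid = red.size - black
    if valid == 0:
        return False                     # tile is entirely off-scene
    # Keep the tile if enough of the non-black pixels are water
    return water / valid >= min_water_frac
```

Because this is a single pass of elementwise comparisons and counts, it runs in time proportional to the number of pixels, which is why it hardly takes longer than opening the image.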
The more complicated part comes after algorithms are made and tested: building them into the existing image processing pipeline. I wrote my algorithm in Python, making use of a few key libraries to do all the image processing; the pipeline is in Ruby and uses a tool called ImageMagick for its image processing. I’m good at programming in Python, but I’d never touched Ruby until working on this project! And ImageMagick does seem quite ‘magical’ to someone who hasn’t used it before.
After reducing the problem of non-coast images, there’s the problem of the dark and red images that are especially common in the Tasmania dataset. The red part has been solved, but the darkness is still there for a lot of images. I have more work to do! But for now, we can say goodbye to a big chunk of the non-coast images in the next data set. No more bright blue snow-capped mountains, or solid fluffy cloud tops, or endless squares of farmland.
I’ll see you on Talk!