Privacy and static asset hosting

grumpy-podmin · May 12, 2018, 2:42pm

I use Amazon S3 for static asset hosting. Ultimately that’s using fog-aws. I did a quick experiment. I made a fairly private post. It is shared with an aspect that has only one other user in it. (post is here: https://a.grumpy.world/posts/13996). I’m fairly confident that nobody can see the text in that post except the one user I shared it with. However, there is an image in that post. Through the magic of S3 static asset hosting, that image has been copied to here: (https://a-grumpy-world.s3.amazonaws.com/uploads/images/scaled_full_8dded91b832f939318cf.jpeg).

I want to use S3 because (a) I don’t want to preallocate storage (the hard drive on my instance will never fill up from user uploads) and (b) I want all the benefits of CDN and caching and all that stuff.

This result, though, is completely contrary to the privacy expectations of my users. They assume that if they post an image on a privately shared message, the image will have the same protections as the text. Unless I’ve misconfigured something in how I do S3 asset hosting, that is totally not happening. All images users post on my pod are given public URLs, no matter what aspects were selected on the post itself. A bad person can’t guess the URL of an image. But if someone who has access to the post shares the URL of the image, that URL is publicly accessible. It will work for anyone and everyone. They don’t even have to be part of the federation at all.

This makes CDNs for static asset hosting problematic, and I wonder if any of the static asset offload techniques have different security semantics for posted images. S3 offers security mechanisms. It offers a “presigned” URL that has a short expiration, like 5 seconds, but that would have to be coded down in fog, and then exposed somehow to Diaspora, which would have to be programmed to use it.

Diaspora is only enforcing permissions on requests that it serves itself. That makes intuitive sense, but the privacy implications of offloading uploaded assets to a CDN are not obvious at all.

supertux88 · May 12, 2018, 9:04pm

We can’t protect the users from themselves. When somebody copy-pastes a URL to an image to somebody who shouldn’t see it, there is nothing diaspora can do about it. It’s the same as when a user copy-pastes (or screenshots) a private post and sends it to somebody who shouldn’t see it.

Images aren’t federated (only the URLs are); they stay on your pod unless you activate the S3 feature. If the URL we federate only worked for 5 seconds, that wouldn’t work, because the user on the other pod may only see the post two days later, and the image should still be visible then.

grumpy-podmin · May 13, 2018, 9:46am

No, those things are not the same. One involves access control on the original content (something that is possible, though perhaps not done now) and the second is making an additional copy and then sharing it. It is obviously not possible to stop/regulate additional copies. But the request for the image comes to the pod. It absolutely is possible to apply some level of access control to it. I need to dig deeper into federation and how that works.
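To make "access control on the original content" concrete, here is a minimal sketch of the check a pod could run before serving an image itself. Everything in it is an assumption for illustration: the `Post` class, its fields, and `can_fetch_image` are hypothetical and are not Diaspora's real models or code. It also only covers requests where the pod already knows who is asking, which is exactly the part federation makes hard.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    """Hypothetical stand-in for a post; not Diaspora's real model."""
    author: str
    public: bool = False
    # Users in the aspects the post was shared with.
    recipients: set = field(default_factory=set)
    # File names of the images attached to the post.
    images: set = field(default_factory=set)


def can_fetch_image(user, image_name, posts):
    """Return True if `user` may fetch `image_name`.

    `user` is None for an anonymous request. An image is visible if any
    post containing it is public, authored by the user, or shared with them.
    """
    for post in posts:
        if image_name not in post.images:
            continue
        if post.public or user == post.author or user in post.recipients:
            return True
    return False


posts = [Post(author="alice", recipients={"bob"}, images={"example.jpeg"})]
print(can_fetch_image("bob", "example.jpeg", posts))      # True
print(can_fetch_image("mallory", "example.jpeg", posts))  # False
print(can_fetch_image(None, "example.jpeg", posts))       # False
```

The point is only that the image would inherit the visibility of the post it belongs to, instead of being readable by anyone who holds the URL.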

The result of all this is that all image assets are public. There’s no access control applied to them. The aspect concept merely regulates which users see an explicit announcement of posts in their timeline.

It isn’t the location of the images that surprises me. It’s the access control. You seem to be suggesting that all images posted to Diaspora get public URLs and there’s no access control at all on the actual image. It’s essentially security through obscurity. As long as only the right people know the URL to the image, only the right people will fetch it.

I see your point. I was too brief. One of the cheap and easy access control mechanisms for S3 is a presigned URL that takes some parameters and has an expiration date. Obviously the expiration can be far greater than 5 seconds. The design is that, just before sending the page to the user, the pod calls an S3 API to get a time-limited URL for the static asset, then it inserts that as the URL for the image into the HTML it returns to the browser.
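Here is a rough sketch of that design, using only the Python standard library. It is not S3's actual signing scheme and it does not call any AWS or fog API; it just illustrates the same idea of a URL that carries an expiry time plus a signature over it. The secret, the host name, and the function names `sign_url`, `verify_url`, and `render_image_tag` are all made up for the example.

```python
import hashlib
import hmac
import html
import time
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: a signing key known only to whoever issues and checks URLs.
SECRET = b"demo-secret-for-illustration-only"


def _signature(path, expires):
    """HMAC over the asset path and the expiry timestamp."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(base_url, lifetime_seconds, now=None):
    """Return base_url with an expiry timestamp and a signature appended."""
    now = int(time.time()) if now is None else int(now)
    expires = now + lifetime_seconds
    sig = _signature(urlsplit(base_url).path, expires)
    return f"{base_url}?{urlencode({'expires': expires, 'sig': sig})}"


def verify_url(url, now=None):
    """Accept the URL only if the signature matches and it has not expired."""
    now = int(time.time()) if now is None else int(now)
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(parts.path, expires)
    return hmac.compare_digest(sig.encode(), expected.encode()) and now <= expires


def render_image_tag(base_url, lifetime_seconds=3600, now=None):
    """Build the <img> tag just before the page is sent to the browser."""
    signed = sign_url(base_url, lifetime_seconds, now)
    return f'<img src="{html.escape(signed, quote=True)}">'


url = sign_url("https://assets.example.org/uploads/images/example.jpeg", 3600, now=1000)
print(verify_url(url, now=1000))  # True: inside the one-hour window
print(verify_url(url, now=4601))  # False: one second past the expiry
```

The lifetime is just a parameter, so an hour or a day is as easy as 5 seconds. What this does not solve is a URL that has to keep working after it has been federated to another pod.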

As I talk this through, I see the problems inherent in any federated design. You’d have to federate an access token of some kind, or you have to have a way for the right people to request one on demand. If you share the access token with the image, it is pointless. If you don’t, then there has to be some user-to-pod or pod-to-pod mechanism for requesting access tokens and granting them only to authorised receivers of the image. That is complicated.
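A sketch of the "request one on demand" variant, to show where the complexity sits. Nothing like this exists in Diaspora as far as this thread establishes: the `ImageTokenIssuer` class, its methods, and the `bob@pod-b.example` identities are hypothetical. The sketch also assumes the requester has already been authenticated somehow, which across pods is the hard part and is not shown.

```python
import secrets
import time


class ImageTokenIssuer:
    """Hypothetical component on the pod that hosts the image."""

    def __init__(self, recipients_by_image, lifetime_seconds=300):
        # Maps image name -> set of identities allowed to see it.
        self.recipients_by_image = recipients_by_image
        self.lifetime_seconds = lifetime_seconds
        # Maps token -> (image name, expiry time); kept server-side only.
        self._tokens = {}

    def request_token(self, requester, image_name, now=None):
        """Issue a token only to an authorised recipient of the image.

        Assumes `requester` is an already authenticated identity.
        Returns None when the requester is not on the recipient list.
        """
        now = time.time() if now is None else now
        if requester not in self.recipients_by_image.get(image_name, set()):
            return None
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (image_name, now + self.lifetime_seconds)
        return token

    def redeem(self, token, image_name, now=None):
        """Check a token presented alongside an image request."""
        now = time.time() if now is None else now
        entry = self._tokens.get(token)
        if entry is None:
            return False
        granted_image, expires = entry
        return granted_image == image_name and now <= expires


issuer = ImageTokenIssuer({"example.jpeg": {"bob@pod-b.example"}})
token = issuer.request_token("bob@pod-b.example", "example.jpeg", now=0)
print(issuer.redeem(token, "example.jpeg", now=10))   # True
print(issuer.redeem(token, "example.jpeg", now=301))  # False: token expired
print(issuer.request_token("mallory@pod-c.example", "example.jpeg", now=0))  # None
```

Because the token is never federated with the post, passing the image URL along is not enough by itself. The cost is an extra round trip and an authentication step for every viewer on every other pod.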

I’m disappointed. There’s so little enforcement. It’s just privacy through obscurity. And that obscurity is trivially unmasked.