Implementation specific fields?

michaelvogel · April 2, 2017, 10:57am

Especially with the “social relay” functionality it can make sense to transmit additional data fields that are only relevant for that specific implementation.

I’m not talking about fields like the ones for events or threaded comments. (Those are useful for Diaspora as well.)

On Friendica the unique identifier is the URI in a format like:
urn:X-dfrn:squeet.me:6:962c3e1035ce3f88030ffce2e624407b021bda5b

I would like to transmit this field as well as fields for the title and the body in our own BBCode format.

What do you think? Is it okay? Are the fields okay? How could they be named?

supertux88 · April 2, 2017, 8:44pm

The protocol is extendable, so you can theoretically add whatever you want. But make sure to make them unique, so you don’t have problems if we maybe add a similar field in the future as “official”. So maybe prefix them with dfrn_ or friendica_ or something like that.

I also thought about adding a title field for diaspora, but I don’t know if we really need it. Currently we auto-detect the title when displaying the post. But maybe we can split it before sending it (or even give the user control to set the title, but that would be another discussion). But you could simply use such a field, unless your title contains BBCode.

About BBCode: For posts that isn’t a problem, diaspora just ignores it and throws it away on receive. For comments I don’t like the idea of also adding the BBCode version, because diaspora needs to store it too, for the signature, so it more or less doubles the space needed in the database (because we need to store the text twice).

So as simple rule:

You can add everything you want to normal entities (diaspora will simply ignore/drop it), just make sure you don’t get collisions in the future.
For relayable entities: Add fields only if you really need them, because diaspora needs to save the content (for the signatures), so please don’t be wasteful here. But if you need it, it wouldn’t be a problem, and if it is useful for others we can maybe even add it to the documentation.
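
A minimal sketch of that rule from the receiving side, written in Python only for illustration (diaspora itself is Ruby). The field names `friendica_title` and `dfrn_uri` and the `KNOWN_FIELDS` set are assumptions made up for this example, not part of any documented schema. A receiver keeps what it knows and can drop the prefixed extras without ever colliding with an official field:

```python
import xml.etree.ElementTree as ET

# Fields this hypothetical receiver understands for a status_message.
KNOWN_FIELDS = {"author", "guid", "created_at", "text", "public"}

def split_fields(xml_text):
    """Return (known, extra) dicts mapping child element name -> text."""
    root = ET.fromstring(xml_text.strip())
    known, extra = {}, {}
    for child in root:
        target = known if child.tag in KNOWN_FIELDS else extra
        target[child.tag] = child.text or ""
    return known, extra

message = """
<status_message>
  <author>alice@example.org</author>
  <guid>17418fb029e6013487743131731751e9</guid>
  <text>Hello</text>
  <friendica_title>A title</friendica_title>
  <dfrn_uri>urn:X-dfrn:squeet.me:6:962c3e1035ce3f88030ffce2e624407b021bda5b</dfrn_uri>
</status_message>
"""

known, extra = split_fields(message)
print(sorted(known))  # fields the receiver processes
print(sorted(extra))  # implementation-specific fields it may ignore
```

For a normal entity the `extra` dict can simply be thrown away; for a relayable it would have to be stored, which is where the size concern comes from.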

And something slightly off-topic at the end (but since you mentioned BBCode, I want to at least mention it):

I wondered whether it would be possible to add Markdown/CommonMark support to Friendica? (In addition to BBCode, because users probably wouldn’t like a hard switch?) You would probably need a flag in the database saying whether a post was received/created as Markdown or BBCode. Rendering client-side with markdown-it works pretty well at diaspora (so you could probably use the same). And since I use Markdown pretty much everywhere on the internet, it’s maybe also a cool feature for friendica users to write posts with it? I think it would help a lot if you didn’t need to convert every post between BBCode and Markdown, and with markdown-it it shouldn’t be that hard to display diaspora posts in the original Markdown format (so you could already drop the Markdown to BBCode conversion). I know that this isn’t a quick change (and maybe it isn’t even possible because something blocks it, I don’t know much about friendica), but I just wanted to mention the idea here so you can maybe think about it?

michaelvogel · April 3, 2017, 4:37am

Markdown-support for Friendica would be a problem at the moment. We are currently using BBCode not only for formatting (which would be okay) but also for the structure. That means that reshares and events are currently done via some BBCode tag. (I know that this is bad)

But I don’t see a huge advantage in completely switching to Markdown, since there will always be some compatibility problems caused by the different Markdown parsers. We are already using standard parsers from Markdown to HTML and vice versa, which are connected with a BBCode to HTML parser and an HTML to BBCode parser.

So I think as a first step I would add the field “dfrn_uri” to both post and comment if you don’t disagree.

svbergerem · April 3, 2017, 11:13am

Most problems can be solved by using a parser following the CommonMark spec.

supertux88 · April 3, 2017, 8:17pm

For posts, absolutely no objections; for comments I’m not sure. It looks like most parts can be reconstructed from the other data, but I don’t know what the “6” between the pod hostname and the guid means. So if it’s important for you, go for it!

But that means that every post that simply contains text/links/images theoretically can be done with Markdown?

Don’t get me wrong, I know that that’s a big change, and it’s not done quickly (and probably there is other stuff that needs to be done first, to make it possible), but I think in the long term it can make things easier. When converting everything from Markdown to HTML and then to BBCode (or the other way around), there are many things that can go wrong, while just using and displaying the original Markdown from diaspora shouldn’t cause any problems. And as @svbergerem mentioned, CommonMark is a spec, so there shouldn’t be any compatibility problems.

So even when it’s not possible right now, you can maybe keep that in mind when doing new things or refactorings, so it’s maybe possible sometimes in the future?

michaelvogel · April 4, 2017, 8:29pm

Besides reshares and events there are also links. While Diaspora simply does an OEmbed request for the first link, we include this data in the post. Maybe it would be an idea to add these fields to a post as well?

supertux88 · April 4, 2017, 8:54pm

Sounds like a good idea. There is always the feature request to suppress the embed or select another link, and with this in the protocol this would be easier in the future.

Diaspora does OEmbed or OpenGraph, so how should we name such a field? And we should make “first” the default when receiving posts from older pods without that field (at least in the beginning), so it should contain an empty field if you want to suppress the embed detection. Or maybe we should make “first” and “none” valid values?

michaelvogel · April 5, 2017, 8:57pm

I would suggest something like this:

<status_message>
  <author>alice@example.org</author>
  <guid>17418fb029e6013487743131731751e9</guid>
  <created_at>2016-07-11T22:38:19Z</created_at>
  <text>I've been totally shocked by the article that I just read!</text>
  <embed>
    <link>http://www.huffingtonpost.co.uk/2015/06/01/9-signs-cats-are-planning-world-domination_n_7482898.html</link>
    <title>Cats Are Conspiring To Take Over The World, And Here’s Proof</title>
    <description>It's fairly obvious to anyone who has been near a cat for a prolonged period of time that the little felines are clearly plotting to take over the world....</description>
    <image>http://i.huffpost.com/gen/3016630/images/o-CATS-PLOTTING-WORLD-DOMINATION-facebook.jpg</image>
  </embed>
  <public>true</public>
</status_message>

If only the sender did the OEmbed call, this would significantly reduce network load, I guess. And in a future version people could even choose the best preview image (as is already possible on Facebook, Google+ and Friendica).

I think it would be best to simply have <embed></embed> in the message to suppress the automatic detection.

supertux88 · April 6, 2017, 12:56am

I think you are mixing OEmbed and OpenGraph. For OpenGraph we could indeed include link, title, description and image, but OEmbed contains the html code that is embedded, and I would only fetch that from the trusted OEmbed endpoint and not receive that via federation.

But for OEmbed we can simply use <embed><link>https://foo.bar/</link></embed> then.

So an empty element would also disable the detection?
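
To make the split concrete, here is a rough Python sketch (illustration only, not how diaspora implements it). The receiver reads only the plain-text preview fields from `<embed>` and ignores anything else, so embeddable HTML never comes in via federation. `SAFE_EMBED_FIELDS`, `read_embed` and the `<html>` child in the sample are names invented for this example:

```python
import xml.etree.ElementTree as ET

# Plain-text preview fields that could travel over federation.
SAFE_EMBED_FIELDS = ("link", "title", "description", "image")

def read_embed(embed_xml):
    """Return only the whitelisted, non-empty text fields of an <embed> element."""
    embed = ET.fromstring(embed_xml.strip())
    return {
        name: embed.findtext(name).strip()
        for name in SAFE_EMBED_FIELDS
        if embed.findtext(name)
    }

sample = """
<embed>
  <link>https://foo.bar/</link>
  <title>Some title</title>
  <html>&lt;iframe src="https://evil.example/"&gt;&lt;/iframe&gt;</html>
</embed>
"""

preview = read_embed(sample)
print(preview)
```

The rich OEmbed markup would still be fetched by the receiving pod itself from an endpoint it trusts, using only the `link` value.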

michaelvogel · April 6, 2017, 5:45am

Yeah, concerning OpenGraph vs. OEmbed: You are right concerning rich content. Of course this mustn’t be transported via federation since that would be a security problem.

I would recommend always transporting all the fields. A system could be configured to not do any OEmbed calls at all. (Friendica has a setting not to embed rich content to prevent tracking.)

So the way would be that a system still does an OEmbed call at the reception of every message with the <link> element - but no OpenGraph call. That would be done when the message was created.

I don’t think that there is a need to further tell a remote system to not do an OEmbed call at all, do you agree?

supertux88 · April 6, 2017, 8:55am

There are users who don’t want their posts to embed a link at all, so they add an empty/invalid link at the beginning (to try to break the auto-detection). So when we add a way to select which link is embedded, we should also cover the case of selecting “no link”.

michaelvogel · April 6, 2017, 9:20pm

“no link” would be this <embed></embed>. What I meant was the difference between OpenGraph data and OEmbed (rich) data.
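
Put together, the three cases discussed so far could look roughly like this on the receiving side. This is a Python sketch under assumptions: the regex for finding links and the function name `preview_link` are made up here, and a real pod would use its own link detection:

```python
import re
import xml.etree.ElementTree as ET

# Deliberately naive link detection, only for this illustration.
LINK_RE = re.compile(r"https?://\S+")

def preview_link(xml_text):
    """Return the URL to preview, or None to show no preview."""
    root = ET.fromstring(xml_text.strip())
    embed = root.find("embed")
    if embed is None:
        # Older sender without the field: fall back to the first link in the text.
        match = LINK_RE.search(root.findtext("text", default=""))
        return match.group(0) if match else None
    link = embed.findtext("link")
    # <embed></embed> has no <link>, which means "no preview".
    return link.strip() if link and link.strip() else None

old_pod = "<status_message><text>see http://a.example/x and http://b.example/y</text></status_message>"
no_link = "<status_message><text>see http://a.example/x</text><embed></embed></status_message>"
chosen = ("<status_message><text>see http://a.example/x and http://b.example/y</text>"
          "<embed><link>http://b.example/y</link></embed></status_message>")

print(preview_link(old_pod))  # first link in the text
print(preview_link(no_link))  # None
print(preview_link(chosen))   # the link named in <embed>
```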

supertux88 · April 6, 2017, 9:27pm

ah, ok sounds good to me, so what do others think?

flaburgan · April 9, 2017, 6:04pm

How much do we care about message size? Wouldn’t a simple number (1 for the first link, 2 for the second one, 0 if we don’t want the preview) be enough, and we let the pod fetch the data? So it can do it however it wants?
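
Roughly what I have in mind, as a small Python sketch. The link regex and the name `link_by_number` are only placeholders for whatever detection the pod already has:

```python
import re

# Naive link detection, standing in for the pod's own logic.
LINK_RE = re.compile(r"https?://\S+")

def link_by_number(text, number):
    """1 = first link, 2 = second, ...; 0 = no preview."""
    if number <= 0:
        return None
    links = LINK_RE.findall(text)
    # An out-of-range number is treated as "no preview" in this sketch.
    return links[number - 1] if number <= len(links) else None

text = "Compare http://a.example/x with http://b.example/y please"
print(link_by_number(text, 2))
print(link_by_number(text, 0))
```

One thing this makes visible: the number only works if sender and receiver detect the same links in the same order.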

flaburgan · April 9, 2017, 6:06pm

About the initial topic, [quote=“michaelvogel, post:1, topic:668”]
Especially with the “social relay” functionality it can make sense to transmit additional data fields that are only relevant for that specific implementation.

On Friendica the unique identifier is the URI in a format like:
urn:X-dfrn:squeet.me:6:962c3e1035ce3f88030ffce2e624407b021bda5b
[/quote]

I still consider the relay kind of a hack. I would really love to have a solution to be able to subscribe to public posts inside the protocol itself.

michaelvogel · April 9, 2017, 7:30pm

This sounds like a bad hack to me.

supertux88 · April 21, 2017, 7:19pm

As already written: The only point where I care about message size is for relayables, because we need to save all fields to recreate a valid signature in the future. For non-relayables the size isn’t important (as long as we don’t use it for filesharing with big files in base64).

So we can still ignore the data in the message and request OpenGraph for the URL in link ourselves, if we want to.

But even then it maybe makes sense to add fields that aren’t used by diaspora.
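
To illustrate the point about relayables above, here is a simplified Python sketch. It assumes the string that gets signed is just the field values joined with `;` in the order they were received; treat that format, the sample field names and the helper `signature_base` as assumptions for the example, and the real RSA signing step is left out:

```python
def signature_base(fields):
    """Join field values with ';' in received order.

    fields: list of (name, value) tuples, signature fields excluded.
    """
    return ";".join(value for _name, value in fields)

comment = [
    ("author", "bob@example.org"),
    ("guid", "abc123"),
    ("parent_guid", "17418fb029e6013487743131731751e9"),
    ("text", "Nice post"),
    ("dfrn_uri", "urn:X-dfrn:squeet.me:6:962c3e1035ce3f88030ffce2e624407b021bda5b"),
]

with_extra = signature_base(comment)
without_extra = signature_base(comment[:-1])

# The two strings differ, so a signature made over one does not match the other.
print(with_extra != without_extra)
```

So if a pod dropped the unknown `dfrn_uri` field of a comment, it could no longer rebuild the string that was signed, which is why every extra field on a relayable has to be stored.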

michaelvogel · April 23, 2017, 1:29pm

BTW: What about namespaces? Wouldn’t <dfrn:uri>urn:X-dfrn:squeet.me:6:962c3e1035ce3f88030ffce2e624407b021bda5b</dfrn:uri> look better?

supertux88 · April 23, 2017, 3:14pm

Hmm, my Ruby DSL (to describe entities) can’t handle : in field names, so that’s a small problem (because I map field names to attribute/method names, and that doesn’t work with :).

But as long as we only ignore dfrn:uri that wouldn’t be a problem (and additional fields in relayables are handled as plain strings, so this isn’t a problem either).
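
The mapping problem is not Ruby-specific. Here is a small Python sketch of one possible workaround: map a namespaced element to a `prefix_local` attribute name. The namespace URI `http://example.org/ns/dfrn` and the helper `attribute_name` are invented for this example; nothing in this thread defines an actual namespace:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the thread does not define one.
DFRN_NS = "http://example.org/ns/dfrn"

xml_text = (
    f'<status_message xmlns:dfrn="{DFRN_NS}">'
    "<dfrn:uri>urn:X-dfrn:squeet.me:6:962c3e1035ce3f88030ffce2e624407b021bda5b</dfrn:uri>"
    "</status_message>"
)

def attribute_name(tag, prefixes):
    """Map '{namespace}local' to 'prefix_local' so it is a valid identifier."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return f"{prefixes[namespace]}_{local}"
    return tag

root = ET.fromstring(xml_text)
names = [attribute_name(child.tag, {DFRN_NS: "dfrn"}) for child in root]

print("dfrn:uri".isidentifier())  # the colon form cannot be used as a name
print(names)
```

Note that a namespace-aware parser also wants the prefix declared: Python’s ElementTree rejects `<dfrn:uri>` with a parse error if there is no `xmlns:dfrn` on an enclosing element, so real namespaces would mean a bit more than a nicer-looking tag name.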