Backup and restore – account migration

TBH, I don’t see any technical problems with restoring posts. It all seems pretty straightforward to me. We just add the posts to the database of the backup pod if they aren’t there yet. Maybe I’m missing something?

On the contrary, not restoring posts would lead to weird situations, like when, after you’ve moved, one of your contacts comments on an old post of yours. The comment gets federated to your new pod, but the parent post is not there, because we haven’t merged it.
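The merge described above can be sketched as an idempotent insert keyed on the post GUID. This is only a rough illustration: the `posts` table layout, its column names, and the `restore_posts` helper are assumptions made for the example, not something taken from diaspora* or the spec.

```python
import sqlite3


def restore_posts(conn, archived_posts):
    """Add archived posts that the backup pod does not have yet.

    Returns the number of posts that were actually inserted.
    """
    added = 0
    for post in archived_posts:
        exists = conn.execute(
            "SELECT 1 FROM posts WHERE guid = ?", (post["guid"],)
        ).fetchone()
        if exists is None:
            conn.execute(
                "INSERT INTO posts (guid, author, text) VALUES (?, ?, ?)",
                (post["guid"], post["author"], post["text"]),
            )
            added += 1
    conn.commit()
    return added


# Hypothetical backup pod that already knows one of the two archived posts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (guid TEXT PRIMARY KEY, author TEXT, text TEXT)")
conn.execute("INSERT INTO posts VALUES ('guid-1', 'alice@old.example', 'hello')")

archive = [
    {"guid": "guid-1", "author": "alice@old.example", "text": "hello"},
    {"guid": "guid-2", "author": "alice@old.example", "text": "an old post"},
]
added = restore_posts(conn, archive)
```

Because existing GUIDs are skipped, running the restore a second time adds nothing, so a later comment on `guid-2` would find its parent post on the new pod.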

So, I’ve now cleaned and published the spec for final discussion, cleanup and voting.

@comradesenya I left in your good idea to retry moved messages (with a small change to play with status codes instead of sending more messages around). But I removed the pointer regarding posts and comments and filed it as an issue instead.

There are also quite a few TODOs in the spec (I will log issues out of these), and I’m pretty sure there are things in general that need fixing.

All in all, no one has given strong opposition to the idea itself, so hoping we can lock down a spec 1.0, approve it for implementing in diaspora* and then it can be implemented (hopefully by @comradesenya ;)).

So basically:

  • The spec is written as markdown and should preferably be worked on via issues and pull requests
  • I chose GitLab because not everybody likes GitHub, because it is nice, and because TheFederation was not available on GitHub :wink: You can easily create an account using your GitHub login. For those who want to participate but don’t like GitLab, please feel free to use other ways to communicate, or even send patch files if you want.
  • The spec is “owned” by The Federation, or more like namespaced. I would like to keep editorship until 1.0 at least though.
  • The spec was generalized to cover diaspora*-like federated social networks, not just diaspora*. I like the idea of being able to move across platforms too. Platform-specific stuff doesn’t belong in the spec but should be left to the implementation.

The spec is live as a working draft version 0.1.0 here: https://the-federation.info/specs/backup-restore/

The git repository is here: https://gitlab.com/TheFederation/backup-restore

Sorry this took a little while. Pinging @jhass as well, since you’ve commented quite a bit on this before on GitHub.

There is a thing we haven’t discussed before.

If the pod a person has moved away from is still alive, shouldn’t we block registration of that name on the old pod, at least for a while?

  1. If a user has moved, and some new user registers on the old pod with the same name as the previous user had, then somebody could still try to discover the old person by the handle, or even send a message.

  2. It is theoretically possible that some pod has discovered the person, but the new pod didn’t know about that pod, so the “I’ve moved” message wasn’t sent to it. In this case, discovery with webfinger could return a 302 status code with the “I’ve moved” message as the body, so that the pod could process this message instantly.

  3. Right after the feature is introduced in the diaspora* source code, many pods will not be up to date yet, so they won’t process the “I’ve moved” message in time. So it would probably be nice to block registrations with the same handles as moved users had, to reduce the inconsistencies produced on the network.

  4. At first I thought about a blocking period for a handle before it becomes available again. However, I think we can consider user handles an inexhaustible resource, so we probably don’t need to unblock them automatically some time after someone has moved. Maybe we could add an action to the podmin page that frees a handle previously occupied by someone who has moved, with statistics of the discovery requests for this old account shown. Then the podmin could free the handle on request from a user, like “Hi! Could you please make the handle of a previously moved person available for registration again?”.
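To make points 2 and 4 a bit more concrete, here is a rough sketch of what the old pod could do on a discovery request for a moved handle. Everything in it is an assumption for illustration: the `MOVED_ACCOUNTS` store, the `webfinger_response` helper, the placeholder payload, and the redirect target (the `/.well-known/webfinger?resource=acct:…` URL form from RFC 7033, which a pod may or may not use). Whether discovery clients would actually read a body on a redirect is an open question.

```python
from collections import Counter
from urllib.parse import quote

# Hypothetical store kept by the old pod: old handle -> data about the move.
MOVED_ACCOUNTS = {
    "alice@old.example": {
        "new_handle": "alice@new.example",
        # Placeholder for the stored, signed "I've moved" payload.
        "moved_message": "<signed I've-moved message>",
    },
}

# Per-handle count of discovery requests, e.g. for a podmin statistics page.
DISCOVERY_REQUESTS = Counter()


def webfinger_response(handle):
    """Return (status, headers, body) for a discovery request on the old pod."""
    moved = MOVED_ACCOUNTS.get(handle)
    if moved is None:
        # Lookup of regular local users is out of scope for this sketch.
        return 404, {}, ""
    DISCOVERY_REQUESTS[handle] += 1
    new_handle = moved["new_handle"]
    new_host = new_handle.split("@", 1)[1]
    location = "https://{}/.well-known/webfinger?resource={}".format(
        new_host, quote("acct:" + new_handle, safe="")
    )
    return 302, {"Location": location}, moved["moved_message"]


status, headers, body = webfinger_response("alice@old.example")
```

The counter gives the podmin the per-handle discovery statistics mentioned in point 4 without any extra bookkeeping.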

TODO: To avoid an extra endpoint, should we use a version of NodeInfo instead?

@jhass had a strong objection to it, so his opinion should probably be taken into account.

TODO: Define full schema of content or place content keys in the first level.

I guess it’s fine if I base the content schema on this?

If a user has moved, and some new user registers on the old pod with the same name as the previous user had, then somebody could still try to discover the old person by the handle, or even send a message.

We could add a recommendation to the spec that old local usernames are reserved permanently - that would make sense since that is what happens when an account is deleted in diaspora afaik (though would have to check to make sure). I’ve made an issue.
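A reservation rule like this is small enough to sketch. The `HandleRegistry` class and its method names are hypothetical and only illustrate the recommendation; the `release` method models the optional podmin action suggested earlier in the thread and is never called automatically.

```python
class HandleRegistry:
    """Tracks which local usernames are in use or reserved on a pod."""

    def __init__(self):
        self.active = set()
        self.reserved = set()

    def register(self, username):
        """Register a new account; refuse names that are taken or reserved."""
        if username in self.active or username in self.reserved:
            return False
        self.active.add(username)
        return True

    def close_account(self, username):
        """Called when an account is deleted or has moved to another pod."""
        self.active.discard(username)
        self.reserved.add(username)

    def release(self, username):
        """Explicit podmin action that frees a reserved username."""
        self.reserved.discard(username)


registry = HandleRegistry()
registry.register("alice")
registry.close_account("alice")  # alice has moved away
```

With a permanent reservation, `release` would simply not be exposed.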

But really, the keys will have been regenerated so there isn’t really a possibility of hijacking an identity like this.

It is theoretically possible that some pod has discovered the person, but the new pod didn’t know about that pod, so the “I’ve moved” message wasn’t sent to it. In this case, discovery with webfinger could return a 302 status code with the “I’ve moved” message as the body, so that the pod could process this message instantly.

You mean store the whole “I’ve moved” message payload on the old pod and use it to respond to webfinger queries made towards the old, closed identity? That sounds good in principle, but to make it work the webfinger client would have to pick up the message from the response body. I think that might be a bit outside the spec?

3 and 4 relate to 1 - where I think we should just recommend reserving forever.

TODO: To avoid an extra endpoint, should we use a version of NodeInfo instead?
@jhass had a strong objection to it, so his opinion should probably be taken into account.

Well, I didn’t mean the NodeInfo but a NodeInfo :stuck_out_tongue: But, I think it is out of scope of this spec anyway, better keep it simple with a dedicated clean endpoint.

I guess it’s fine if I base the content schema on this?

Yes, but generalized for the spec, so we shouldn’t include all the keys but a generic set. diaspora* can of course implement a larger set containing the full amount of data used in our profiles.

Yes, but generalized for the spec, so we shouldn’t include all the keys but a generic set.

If you truly want to support “all” networks, you have to define a minimal set of keys. Otherwise, diaspora may use username where other networks might use user, and everything becomes a complete mess.

I believe we may define a minimal set of keys as required and everything else as optional, so other networks pick what they would like to support.
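As a rough sketch of how an importing network could treat such a split: required keys must be present, optional keys are taken only if the network supports them, and anything else is ignored. The key names and the `import_archive` helper are made up for the example, since the actual set was not decided at this point.

```python
# Hypothetical minimal set that every implementation must provide.
REQUIRED = {"username", "contacts"}
# Optional keys that this particular (hypothetical) network chose to support.
OPTIONAL_SUPPORTED = {"birthday"}


def import_archive(archive):
    """Return (accepted, ignored) for an incoming archive dict.

    Raises ValueError if a required key is missing.
    """
    missing = REQUIRED - set(archive)
    if missing:
        raise ValueError("missing required keys: " + ", ".join(sorted(missing)))
    known = REQUIRED | OPTIONAL_SUPPORTED
    accepted = {key: value for key, value in archive.items() if key in known}
    ignored = sorted(set(archive) - known)
    return accepted, ignored


accepted, ignored = import_archive(
    {"username": "alice", "contacts": [], "birthday": "1990-01-01", "aspects": []}
)
```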

Defining a single all-around schema with strict keys is hard, especially as we ourselves use pretty unique names like “aspects”. Making our export not use our own key names just to support this would be odd.

What about aliases? The spec defines the basic expected keys using the most common key names that would be expected in a generic spec, and then we define a set of aliases, for example aspects could map to something that might be more generic like contact_groups.

These aliases can be expanded as seen fit in future spec versions.
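The alias idea could look roughly like this on the importing side. The `ALIASES` table and the `normalize_keys` helper are hypothetical; the only mapping taken from this thread is `aspects` to `contact_groups`.

```python
# Alias -> canonical spec key. Only this one mapping comes from the discussion.
ALIASES = {"aspects": "contact_groups"}


def normalize_keys(archive):
    """Rewrite aliased top-level keys to their canonical spec names."""
    normalized = {}
    for key, value in archive.items():
        canonical = ALIASES.get(key, key)
        if canonical in normalized:
            # Both the alias and the canonical key were present.
            raise ValueError("duplicate key after aliasing: " + canonical)
        normalized[canonical] = value
    return normalized


normalized = normalize_keys({"name": "alice", "aspects": ["Family", "Work"]})
```

Note that every importer would need to carry the full alias table, and an archive containing both an alias and its canonical key has to be rejected or resolved somehow.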

In the PR I introduced the format by generating a schema based on our exported archive JSON document using http://jsonschema.net/

Then I manually edited it to fix some issues and I removed most of the keys from the “required” section, leaving
“name”,
“private_key”,
“profile”,
“contacts”.

The latter two, “profile” and “contacts”, are objects themselves but don’t have any required fields. So we can either make them optional as well, or make some of their contents required.
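To illustrate the two options, here is a sketch of the schema as a Python dict using the JSON Schema `required` and `properties` keywords, plus a tiny checker that only understands those two keywords (a real validator would do much more). The nested `first_name` field is a made-up example of a required profile field, not something from the PR.

```python
import copy

# Option A: only the four top-level keys are required.
top_level_only = {
    "type": "object",
    "required": ["name", "private_key", "profile", "contacts"],
    "properties": {
        "name": {"type": "string"},
        "private_key": {"type": "string"},
        "profile": {"type": "object"},
        "contacts": {"type": "object"},
    },
}

# Option B: additionally require something inside "profile" (hypothetical field).
nested_required = copy.deepcopy(top_level_only)
nested_required["properties"]["profile"]["required"] = ["first_name"]


def missing_required(schema, document, path=""):
    """List dotted paths of required keys that are absent from the document."""
    missing = [path + key for key in schema.get("required", []) if key not in document]
    for key, subschema in schema.get("properties", {}).items():
        if isinstance(document.get(key), dict):
            missing += missing_required(subschema, document[key], path + key + ".")
    return missing


document = {"name": "alice", "private_key": "...", "profile": {}, "contacts": {}}
```

The same document passes option A but fails option B, which is the practical difference between the two choices.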

Defining a single all-around schema with strict keys is hard, especially as we ourselves use pretty unique names like “aspects”.

Ah c’mon. What do you want to achieve? Do you want something that looks fancy but nobody is able to use between implementations or do you want to have a spec that defines a standard way to exchange account data?

If you want a spec, then define keys. Define username, contacts, groups, birthday and stuff. No aliases. One key has exactly one descriptor, no exceptions. What implementations call these things internally is completely irrelevant to your spec, since you should define a data format, not a set of rules on how social networks should behave internally. Diaspora may put diaspora_id in the username field, Friendica may call the same database column lookhowfancytheuseriscalled. The spec does not care. And the spec should not care.
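Under this approach the translation happens inside each implementation at export time, so the exchanged document only ever contains the spec keys. A minimal sketch, in which the mapping table, the internal record, and the `to_spec_document` helper are all hypothetical:

```python
# Spec key -> internal field name for one (hypothetical) implementation.
DIASPORA_FIELDS = {
    "username": "diaspora_id",
    "contacts": "contacts",
    "groups": "aspects",
    "birthday": "birthday",
}


def to_spec_document(internal_record, field_map):
    """Build an export that uses only spec keys, whatever the internal names are."""
    return {
        spec_key: internal_record[internal_name]
        for spec_key, internal_name in field_map.items()
    }


internal = {
    "diaspora_id": "alice@pod.example",
    "contacts": ["bob@pod.example"],
    "aspects": ["Family"],
    "birthday": "1990-01-01",
}
exported = to_spec_document(internal, DIASPORA_FIELDS)
```

Another network would ship its own mapping table, and the importer never sees any internal names.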

@dennisschubert so how do we use our JSON exports to restore then, as they would be incompatible? Or are you saying the diaspora JSON exports would be renamed?

If you write a spec that is well-written and suitable for diaspora’s needs (that is, we can somehow export all fields), I’m sure adapting to the spec is an easy task…

Think about the other side, the people implementing specs. What would you do if you saw a spec that literally says "username contains the user’s name. Be aware that this field may also be called profile_name, name or fancyfancy"? How are you supposed to implement that? Rolling dice while implementing specs is surely not a good thing.

Do we follow the federation protocol for the “Delivery package” and for the “I’ve moved” messages? If so, we must use XML instead of JSON.

Also, if we reuse federation protocol for these messages, we don’t have to define signing methods in our spec, since signing is already implemented on the salmon level.

But there is a problem: if we want to rely on the federation protocol in the spec, we must refer to the federation protocol spec to stay formal, but AFAIK there is no formal specification document for the federation protocol.

And in this case “Backup server receive route” is the usual public receive route for the server.

I don’t think we should mix this and diaspora federation together. At least the spec can’t be written like that.

More comments tomorrow, zzzz…

Well, I don’t know, that is exactly how we can reuse the endpoints and the signing code too. I feel it is optimal to reuse the federation protocol rather than invent one more way to exchange messages. To write the spec we’ll need some formal definition of the federation protocol then. At least some short version.

@dennisschubert, what do you think?