Diaspora cannot start on Ubuntu 20.04, bad identity

Okay, so… this is a really weird one. I’m trying to follow the Ubuntu installation guide on the wiki in an Ubuntu 20.04 container (LXC), and it refuses to start for the weirdest reason I’ve ever seen for an error, to date. Here’s the output from ./script/server (slightly trimmed for brevity):

Starting Diaspora in production mode with 1 Sidekiq worker(s).

I, [diaspora:__default__] call: 
I, [diaspora:__default__] schedule :monitor (load by user)
I, [diaspora:__default__] => monitor  (load by user)
I, [diaspora:__default__] starting async with 0.2s chain monitor 
I, [diaspora:sidekiq] call: 
I, [diaspora:sidekiq] schedule :monitor (load by user)
I, [diaspora:sidekiq] => monitor  (load by user)
I, [diaspora:sidekiq] starting async with 0.2s chain monitor 
I, [diaspora:web] schedule :monitor (monitor by user)
I, [diaspora:web] => monitor  (monitor by user)
I, [diaspora:__default__] <= monitor
I, [Eye] <= loading: ["/home/diaspora/diaspora/config/eye.rb"]
I, [diaspora:sidekiq:sidekiq1] schedule :monitor (monitor by user)
I, [diaspora:sidekiq:sidekiq1] => monitor  (monitor by user)
W, [diaspora:web] compare_identity: fail, pid_file: '17 Sep 22:23', process: '26 Nov 10:10' (unicorn master -c config/unicorn.rb -D)
I, [diaspora:sidekiq] <= monitor
W, [diaspora:web] load_external_pid_file: process <10931> from pid_file failed check_identity
I, [diaspora:web] switch :starting [:unmonitored => :starting] monitor by user
I, [Eye] <= command: load /home/diaspora/diaspora/config/eye.rb (1.093484713s)
I, [diaspora:web] executing: `bin/bundle exec unicorn -c config/unicorn.rb -D` with start_timeout: 15.0s, start_grace: 2.5s, env: 'RAILS_ENV=production PORT=' (in /home/diaspora/diaspora)
I, [diaspora:sidekiq:sidekiq1] load_external_pid_file: pid_file not found
I, [diaspora:sidekiq:sidekiq1] switch :starting [:unmonitored => :starting] monitor by user
I, [diaspora:sidekiq:sidekiq1] daemonizing: `bin/bundle exec sidekiq` with start_grace: 2.5s, env: 'RAILS_ENV=production', <11156> (in /home/diaspora/diaspora)
I, [diaspora:sidekiq:sidekiq1] sleeping for :start_grace 2.5
I, [diaspora:web] sleeping for :start_grace 2.5
I, [diaspora:sidekiq:sidekiq1] switch :started [:starting => :up] monitor by user
I, [diaspora:sidekiq:sidekiq1] <= monitor
W, [diaspora:web] compare_identity: fail, pid_file: '17 Sep 22:23', process: '26 Nov 10:10' (unicorn master -c config/unicorn.rb -D)
W, [diaspora:web] load_external_pid_file: process <10931> from pid_file failed check_identity
E, [diaspora:web] exit status 1, process <> (from /home/diaspora/diaspora/tmp/pids/web.pid) bad_identity; this is really strange case, like timestamp of server was updated, may be need to reload eye (you should check the process logs ["/home/diaspora/diaspora/log/eye_processes_stdout.log", "/home/diaspora/diaspora/log/eye_processes_stderr.log"])
E, [diaspora:web] process <10931> failed to start (:bad_identity)
W, [diaspora:web] killing <10931> due to error
I, [diaspora:web] send_signal KILL to <10931>
I, [diaspora:web] switch :crashed [:starting => :down] monitor by user
I, [diaspora:web] schedule :check_crash (crashed)
I, [diaspora:web] <= monitor
I, [diaspora:web] => check_crash  (crashed)
W, [diaspora:web] check crashed: process is down
I, [diaspora:web] schedule :restore (crashed)
I, [diaspora:web] <= check_crash
I, [diaspora:web] => restore  (crashed)
I, [diaspora:web] load_external_pid_file: pid_file found, but process <10931> not found
I, [diaspora:web] switch :starting [:down => :starting] crashed
I, [diaspora:web] executing: `bin/bundle exec unicorn -c config/unicorn.rb -D` with start_timeout: 15.0s, start_grace: 2.5s, env: 'RAILS_ENV=production PORT=' (in /home/diaspora/diaspora)
  • Version: v0.7.15.0 (Git commit 1d0982822b0278525b4d5be881114ff0977ea9df).
  • Contents of eye_processes_stdout.log: Nothing, besides a deprecation warning about Sidekiq::Web.sessions=.
  • Contents of eye_processes_stderr.log: Also nothing, aside from a Redis#exists(key) return-type warning, which is, as far as I can tell, irrelevant.

Searching around, literally the only other piece of information I can find is this thread, which reaches a rather unsatisfactory conclusion.

So, what else can I try to gather from the resources I have?

The first actual error line in the script output is this:

[diaspora:web] exit status 0, process <11162> (from /home/diaspora/diaspora/tmp/pids/web.pid) bad_identity; this is really strange case, like timestamp of server was updated, may be need to reload eye (you should check the process logs ["/home/diaspora/diaspora/log/eye_processes_stdout.log", "/home/diaspora/diaspora/log/eye_processes_stderr.log"])

Followed by the process being killed for the reason bad_identity. Looking just above, we can see why it’s doing that:

compare_identity: fail, pid_file: '17 Sep 23:10', process: '26 Nov 10:57' (unicorn master -c config/unicorn.rb -D)

I can tell that the pid_file timestamp is the modification time of the PID file specified in the diaspora.toml config, and it lines up well with the server time according to date: Fri Sep 17 23:12:24 EDT 2021. So apparently the process itself has a start timestamp (somehow?) of over two months in the future, and this discrepancy is causing it to be killed.

Anyone have any ideas?

Eye doesn’t work with LXC because of a bug in kostya-sigar (a library eye uses). Something about how LXC handles uptime/boot time/process uptime doesn’t work, which then leads to this problem (I think the two-month difference is your uptime or something like that; I don’t remember the details).
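For what it’s worth, this kind of skew can be sketched arithmetically. On Linux, a process start time is typically derived as boot time plus the starttime tick count from /proc/&lt;pid&gt;/stat. All the timestamps and the CLK_TCK value below are hypothetical, and this is only an illustration of how combining host-relative ticks with a container’s boot time lands the result in the future by the host’s uptime, not sigar’s actual code:

```python
from datetime import datetime, timedelta

CLK_TCK = 100  # typical Linux clock tick rate (getconf CLK_TCK)

# Hypothetical: the host booted ~69 days 16 hours before the container.
host_boot = datetime(2021, 7, 10, 6, 0)
container_boot = host_boot + timedelta(days=69, hours=16)

# A process starts 10 minutes after the container comes up, but its
# starttime ticks in /proc/<pid>/stat are counted from the *host* boot.
real_start = container_boot + timedelta(minutes=10)
starttime_ticks = int((real_start - host_boot).total_seconds() * CLK_TCK)

# Correct: combine host-relative ticks with the host's boot time.
correct = host_boot + timedelta(seconds=starttime_ticks / CLK_TCK)
# Buggy: combine host-relative ticks with the container's boot time.
buggy = container_boot + timedelta(seconds=starttime_ticks / CLK_TCK)

print(correct)           # → 2021-09-17 22:10:00 (matches real_start)
print(buggy - correct)   # → 69 days, 16:00:00 (the host's uptime)
```

The buggy computation shifts the apparent start time forward by exactly the host’s uptime, which is the same two-month-ish gap the compare_identity check trips over.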

You have two options now: don’t use LXC, or don’t use eye (script/server) to start diaspora, and instead start it with something like systemd: Alternative startup methods - diaspora* project wiki

All eye does is start unicorn and sidekiq, monitor the processes, and restart them if they crash. systemd can do the same thing, or you can just start bin/bundle exec unicorn -c config/unicorn.rb -E production and bin/bundle exec sidekiq with something else (though you may lose the monitoring and restarting of the processes if your method doesn’t support that).
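For reference, a minimal sketch of what such systemd units could look like. The unit names, paths, and dependencies here are assumptions; the wiki page linked above has the project’s actual definitions:

```ini
# /etc/systemd/system/diaspora.target  (sketch only)
[Unit]
Description=diaspora*
Wants=postgresql.service redis-server.service
After=network.target postgresql.service redis-server.service

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/diaspora-web.service  (sketch only, separate file)
[Unit]
Description=diaspora* web (unicorn)
PartOf=diaspora.target

[Service]
User=diaspora
WorkingDirectory=/home/diaspora/diaspora
Environment=RAILS_ENV=production
ExecStart=/home/diaspora/diaspora/bin/bundle exec unicorn -c config/unicorn.rb -E production
Restart=always

[Install]
WantedBy=diaspora.target
```

With `Restart=always`, systemd covers eye’s crash-and-restart behavior; a matching sidekiq service would follow the same pattern, and `systemctl start diaspora.target` brings the group up together.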

My container uptime is 24 minutes as of this comment, though the hypervisor is sitting at… 69 days, 16 hours. 69 days in the future is… Friday, November 26th. So it’s picking up the hypervisor’s uptime, for… okay, yeah, that’s just plain weird.
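That checks out arithmetically, taking the pid_file timestamp from the first log above and adding the hypervisor uptime:

```python
from datetime import datetime, timedelta

# PID file written at server time Sep 17 22:23; hypervisor up 69d 16h.
pid_file_time = datetime(2021, 9, 17, 22, 23)
shifted = pid_file_time + timedelta(days=69, hours=16)
print(shifted)  # → 2021-11-26 14:23:00, the same day Eye reports
```

The hours differ slightly from the logged ‘26 Nov 10:10’ because the “24 minutes / 69 days 16 hours” figures were read off at comment time, not at the moment of the failed start.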

I planned on switching to systemd once I was out of the testing and setup phase (configuring HTTP routing, making sure it’s stable, the usual setup stuff). I would imagine it might be helpful to note this on the wiki somewhere, unless I’m just completely blind and missed it (which I am, unfortunately, known to do on occasion).

Thanks for the quick response though; let’s take a look.

Indeed, it works just fine with systemd. (It doesn’t work when I accidentally tell NGINX to accept HTTP/2 without telling HAProxy to also send HTTP/2, but it wouldn’t be right to configure something without making at least one boneheaded mistake, right?)
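For anyone hitting the same mismatch: the fix is just keeping the two ends in agreement about the protocol. A sketch, where the port, backend name, and certificate handling are all made up and the exact directives depend on your HAProxy and NGINX versions:

```
# HAProxy backend: negotiate HTTP/2 toward NGINX over TLS via ALPN
backend diaspora_web
    server web1 127.0.0.1:8443 ssl verify none alpn h2,http/1.1

# NGINX: accept HTTP/2 (and fall back to HTTP/1.1) on the matching listener
server {
    listen 127.0.0.1:8443 ssl http2;
    # certificates and proxying to unicorn as usual
}
```

If HAProxy only speaks HTTP/1.1 to a listener expecting HTTP/2 (or vice versa), requests fail exactly the way described above.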

On a side note, that’s an interesting way to structure the systemd definitions. 99% of other apps just ship a single unit file that points to a “start everything” script; separating out the components and grouping them under a target is actually rather well executed.