Language filter or automated translations

Continuing the discussion from Language filter for diaspora - as a gsoc project:
Wondering what the status of this thread is? Do you guys have a language detection/filtering implemented yet?

As a newer user, I’ve been diving into all of the settings. I’m also inviting friends to diaspora* who are not very techie. Of all the little features I would like tweaked or added, a language filter is by far number one right now. It would greatly improve the user experience and help retain and engage new users.

So, just wondering, since this is an older thread, is language detection/filtering something that is in the works?

It is not. It’s one of those features where people like to claim “it’s easy, just do X” and ask “why is this not done yet??”, but reality looks different. There are a lot of issues with any potential idea anyone comes up with. The summary is:

  • If we ask the user to specify the language per profile, we’ll screw with people who post in multiple languages (I, for example, post in German and English).
  • If we ask the user to specify the language per post, we’d add a lot of friction to creating a post.
  • If we made specifying a language per post optional, most people probably would not fill out the field, making the whole language filter idea useless again.
  • If we depend on auto-detecting the language, we’d run into many issues, most significantly:
    • There is no really good library to do this. Most things we could package work reliably on larger texts, but have a high rate of uncertainty on shorter posts.
    • If we’d depend on external services, then, well, we’d be forced to send posts to external services. If we made that optional, we’d make the whole feature useless again.
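
To illustrate why short posts are hard to classify, here is a toy sketch of stopword-based guessing. It is not how any real detection library works, and the word lists, the `guess_language` name, and the `min_hits` threshold are all assumptions made up for this example. The point is only that a long post offers plenty of evidence, while a short one offers almost none.

```python
import re

# Tiny, hand-picked stopword lists; purely illustrative, not a real language model.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "it", "for", "with"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ich", "zu", "mit", "ein"},
}


def guess_language(text, min_hits=3):
    """Return (language, hits), or (None, hits) when the evidence is too thin."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    hits = {lang: sum(w in stop for w in words) for lang, stop in STOPWORDS.items()}
    best = max(hits, key=hits.get)
    ranked = sorted(hits.values(), reverse=True)
    # Refuse to guess when there are too few hits or the top two languages tie.
    if hits[best] < min_hits or ranked[0] == ranked[1]:
        return None, hits[best]
    return best, hits[best]


long_post = (
    "Ich habe heute das neue Update installiert und es ist nicht mit dem "
    "alten Plugin kompatibel, die Fehlermeldung ist ein Rätsel."
)
short_post = "Diaspora 0.7 released! #diaspora"

print(guess_language(long_post))
print(guess_language(short_post))
```

The short post contains no stopwords at all, so the sketch declines to guess. Real libraries use far better statistics, but they face the same shortage of evidence.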

We could offer an automatic translation button, but then again, that has huge implications. Personally, I would very much dislike seeing private posts transmitted to external services.

The reason people run into language issues in the first place is the way diaspora wants to handle content discovery. Instead of using network effects, we just allow people to use hashtags to mark and discover content. Tags, especially tags that contain predefined terms or product names/trademarks, generally are not language-specific, so people end up subscribing to posts they don’t understand. This is, indeed, an issue, but I’m still not convinced that language filtering is a good solution.

Perhaps this might be a good resource: http://fosmt.org/

Since diaspora has many different users who speak many different languages, I find myself not being able to understand many posts.

Built-in translation could be diaspora’s “killer app”


Note: This discussion was imported from Loomio.

Have you searched for previous issues regarding this? Because there are some.

I did a cursory search but found nothing. I’ll be more thorough in the future.

Keep in mind that Loomio’s search facility only returns results from the sub-groups that you have joined (which is slightly annoying). So if you’re not a member of sub-groups in which such a proposal is likely to have been posted, such as ‘Feature Proposals’, you’d need to look manually through the threads in that sub-group to find any previous discussion (which is slightly more annoying!).

Okay, thanks for the heads up!

I totally agree on this and maybe Loomio should be top priority to have this implemented, for democratic reasons. :slight_smile:
I’d say this function should avoid Google Translate by having the pods handle the translations, for privacy reasons.

Until then, there are browser plug-ins like Translate This and Instant Translate. The best option is always to learn the language; Instant Translate could help with this through its double-click translation (which needs to be enabled).

Could this somehow be moved to Feature Proposals?

Most of those browser add-ons still use Google Translate, unfortunately. And when I say ‘most’, I mean ‘good luck finding one that doesn’t.’ Instant Translate, which you mentioned, appears to be closed-source. I would love to see a browser extension that uses Apertium.

This is what I’ve been using. And it seems to work alright. The reviews report a lot of odd behavior, and the possibility that it does not respect privacy. Personally, I have not noticed any of these issues. However, you could chalk that up to the fact that I’m not on Windows, and have fairly heavy privacy controls.

While I fully support the idea of auto-translating on Diaspora*, it really seems like a moonshot kind of effort. I want it eventually, but adding translation to all the other things pods are doing would be computationally expensive.

How about supporting multi-lingual posts? The one posting the message could do the translation himself, picking his target audience and removing errors by hand.
I think there was the proposal for specific language tags as well, which would fit into this.
This way the pod could eventually do the remaining translations during otherwise idle time (with some settings for the podmin). The order of translation could be based on anonymous statistics on network wide language preferences and post time.

I really like the idea of the user specifying the language of the post during its creation. The obvious advantage of this would be:

Filtering: the ability to filter non-English posts would be brilliant. Even with the best translation services available, the post quality would be questionable at best, and promoting any sort of discussion through translated text would be difficult, to say the least.

I think translations would be a huge undertaking, and a simple feature like the one described above could be a really quick fix that would improve the user experience tremendously. Seriously, /public is completely impossible to browse, and good luck having a healthy social community when people can’t discover anyone to communicate with…

Edit: I’ve made a little mockup to illustrate this:
[image: mockup]

Coming from Google+, I was a heavy user of the “translate” button. I’d love to see something similar here.

I realize this is an old post, but it appears to be the most appropriate place for my thoughts about language. I speak and read English and French. When I look at public activity on D*, I see tons of posts in German. Many of the posters also post in English. I was wondering what everyone who cares thinks about this concept:

If posters used the standard hashtags for language (en, de, fr, es) voluntarily, they would indicate what language the post was in. Assuming that was done by even half of them, it would still be a help.

At that point, if code was added to detect the hashtag, check if it was in the user’s settings, and NOT show the post if it wasn’t on the list of “accepted” languages, I could watch the public stream. Unlike other “ignore hashtag” suggestions, I would propose that this NOT show the post, even if it’s by someone you follow.

The goal is being able to see only posts you can read, thereby removing the distraction of tons of posts you cannot read. Does anyone think this is a reasonable way to do it? I believe translation should be done by the person who wants it, unless some easy way to do it is found.
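
A minimal sketch of that filter might look like the following. It is plain Python for illustration only and has nothing to do with diaspora’s actual code base. The names `language_tags` and `visible` are hypothetical, and two behaviours are assumptions the post does not settle: untagged posts stay visible (since tagging is voluntary), and a post tagged with several languages is shown if any one of them is accepted.

```python
import re

# The language hashtags named in the post; any other set would be an assumption too.
LANGUAGE_TAGS = {"en", "de", "fr", "es"}


def language_tags(text):
    """Collect the language hashtags (#en, #de, ...) that appear in a post."""
    tags = {t.lower() for t in re.findall(r"#(\w+)", text)}
    return tags & LANGUAGE_TAGS


def visible(text, accepted):
    """Hide a post only if it carries language tags and none of them is accepted."""
    tags = language_tags(text)
    return not tags or bool(tags & accepted)


stream = [
    "Guten Morgen! #de",
    "New pod release notes #en #diaspora",
    "Bonjour tout le monde #fr",
    "Untagged post",
]
accepted = {"en", "fr"}
shown = [post for post in stream if visible(post, accepted)]
print(shown)
```

The check never looks at who wrote the post, which matches the suggestion that a post in a language you have not accepted stays hidden even when it comes from someone you follow.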

“If posters used the standard hashtags for language (en, de, fr, es) voluntarily, they would indicate what language the post was in.”

It would be a good first step; in fact, some already do this. Sadly, it is impossible to ignore tags for now, but it is possible to follow all posts in your language, which is sometimes helpful.

I would propose this mechanism for getting around language barriers without breaking the logic of Diaspora:

  1. Languages are tagged with specific tags, e.g. #lang_en, #lang_es, #lang_ru

  2. Hashtag filtering is introduced so people can ignore languages they don’t understand but encounter often.

  3. Two user options are added: “user language” and “followed languages”. The first one specifies the languages the user writes posts in and tags the user’s posts with one or more of them. If only one language is selected, all posts are tagged with it; if there are multiple, the user can choose among them via a dropdown when posting. “Followed languages” are the languages to follow posts in. Languages not in this list will be filtered from the stream. The list includes “other”, which allows viewing or ignoring untagged posts.

  4. At this point manual language tag entry will no longer be necessary and language tags can be hidden from display and used only under the hood.

The advantage of this approach is that it can be introduced gradually and relies on existing (or hopefully soon to be added) logic of Diaspora. It does not need language detection, which can be buggy. The user has more control (which includes the power to disable any filtering whatsoever).
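
The four steps could be sketched roughly as follows. This is an illustration in plain Python, not diaspora’s real data model. `LanguageSettings`, `tag_post`, `in_stream`, and `display_text` are hypothetical names, and treating an empty “followed languages” list as “filtering disabled” is an assumption.

```python
import re
from dataclasses import dataclass, field

LANG_TAG = re.compile(r"#lang_([a-z]{2,3})\b")
OTHER = "other"  # stands for posts that carry no language tag


@dataclass
class LanguageSettings:
    user_languages: list  # languages the user writes posts in
    followed_languages: set = field(default_factory=set)  # may include OTHER


def tag_post(text, settings, chosen=None):
    """Append a #lang_xx tag: automatic with one user language, else the chosen one."""
    if LANG_TAG.search(text):
        return text  # already tagged by hand
    if len(settings.user_languages) == 1:
        chosen = settings.user_languages[0]
    if chosen not in settings.user_languages:
        return text  # nothing picked in the dropdown: leave the post untagged
    return f"{text} #lang_{chosen}"


def in_stream(text, settings):
    """Keep a post if one of its languages is followed; untagged posts count as OTHER."""
    if not settings.followed_languages:
        return True  # assumption: an empty list means filtering is disabled
    langs = set(LANG_TAG.findall(text)) or {OTHER}
    return bool(langs & settings.followed_languages)


def display_text(text):
    """Step 4: strip the language tags so they work only under the hood."""
    return LANG_TAG.sub("", text).rstrip()
```

A user with a single language would get every post tagged automatically, while a user with several would pass the dropdown choice as `chosen`. Adding `OTHER` to `followed_languages` is what keeps untagged posts in the stream.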

I agree, and that is pretty much what I imagined. Declare the languages you want to see and the one you most often post in, which would add the hashtag of your language. The logic isn’t expensive to code (IMO), but maybe the user interface is. Having some kind of language arrangement would add a lot to Diaspora. Right now, I just manually pass by posts in German, but it’s a lot of clicks or spacebar presses.

The best way to remove language barriers is learning English. Those who wish to understand a discussion that is not in English or in their mother tongue could use online translators. That’s what I do.

The ability to talk in your own language is very important. As soon as you have to talk in another language, you will run into issues expressing yourself the way you intend to.
I think an inbuilt translation tool is very valuable. It takes away the need for cumbersome measures to translate texts.
A language declaration would be a good starting point for adding auto-translation at a later stage.

Could language declaration be tags in the profile? Or were you thinking of something else?