Language filter or automated translations

I agree with what @jonnehass said.
As the initial step the focus should be on detecting the language. That is not a small feature by itself.

Meanwhile we can keep thinking about a way to implement translation that is practical, but also aligns with security and privacy concerns that form the base of what Diaspora stands for.

Those are two separate features and should not be munged into one task.

Hey,
Yes, the initial focus of this GSoC project will be implementing language detection and tagging relevant posts (as planned in the previous comments). However, there will be a parallel discussion on how to integrate the translation feature as well.

I’m against automatic translation through a third-party API.
If I’m interested, I can always paste the content into translate.google.com myself without sending data to it. I dislike the exposing …

Hi, is there any news on this project? I feel like it would be very useful to be able to filter out posts that are not comprehensible to the user.

How about allowing users to specify the languages they know in their profile? That could at least be used to exclude from the streams posts by users who have no language in common with the reader.

For example, if I choose English and French and someone else has chosen English and Italian, I might still see posts in Italian from that person if we use the same hashtag, but at least I wouldn’t see posts from somebody who has chosen Dutch, German and Spanish as their favorite languages.

This is far from perfect (unless you select only one language), but it would still improve the streams without using a third-party API (like Google Translate), which is a problem for some people. I suppose it shouldn’t be very complex to develop?
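
To make the profile-based idea above concrete, here is a minimal sketch in Python. It is not diaspora code: the function names, the dictionary layout and the language codes are all made up for illustration, and it assumes each profile simply stores a set of languages.

```python
def shares_language(viewer_langs, author_langs):
    """Return True when viewer and author declare at least one common language."""
    # Assumption: authors who declared no languages are never filtered out.
    if not author_langs:
        return True
    return bool(set(viewer_langs) & set(author_langs))


def filter_stream(posts, viewer_langs, author_profiles):
    """Keep only posts whose author shares a language with the viewer.

    posts           -- list of dicts with an "author" key (hypothetical layout)
    author_profiles -- dict mapping author name to a set of language codes
    """
    return [
        post for post in posts
        if shares_language(viewer_langs, author_profiles.get(post["author"], set()))
    ]


# The example from the post above: the viewer reads English and French.
profiles = {
    "alice": {"en", "it"},        # shares English with the viewer
    "bob": {"nl", "de", "es"},    # shares nothing with the viewer
}
stream = [
    {"author": "alice", "text": "Ciao a tutti"},
    {"author": "bob", "text": "Hallo allemaal"},
]
kept = filter_stream(stream, {"en", "fr"}, profiles)
```

As the post says, alice's Italian post still gets through because the filter only looks at profiles, not at the language of each post.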

I would like to have an option to filter out languages other than Finnish and English, as I don’t understand other languages.

Currently I can unaspect people who mainly write in languages I don’t understand, but there are still followed hashtags which aren’t restricted to one specific language as either the same word exists in both languages or everyone just uses the word instead of whatever the word for their language is.

Continuing the discussion from Language filter for diaspora - as a gsoc project:
Wondering what the status of this thread is? Do you guys have language detection/filtering implemented yet?

As a newer user, I’ve been diving into all of the settings. I’m also inviting friends to diaspora* that are not very techie. And, I think that of all of the little features I would like tweaked or added, a language filter is by far number one right now. It would greatly improve the user experience and help retain and engage new users.

So, just wondering, since this is an older thread, is language detection/filtering something that is in the works?

It is not. It’s one of those features where people like to claim “it’s easy, just do X” and ask “why is this not done yet??”, but reality looks different. There are a lot of issues with any potential idea anyone comes up with. The summary is:

  • If we ask the user to specify the language per profile, we’ll screw with people who post in multiple languages (I, for example, post in German and English)
  • If we ask the user to specify the language per post, we’d add a lot of friction to creating a post
  • If we made specifying a language per post optional, we’d run into a situation where most people probably would not fill out the field, making the whole language filter idea useless again
  • If we depend on auto-detecting the language, we’d run into issues with many things, most significantly
    • There is no really good library to do this - most things we could package work reliably on larger texts, but have a high rate of uncertainty in shorter posts.
    • If we’d depend on external services, then well, we’d be forced to send posts to external services. If we’d make that optional, then again, we’d make the whole feature useless again.
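
The short-post problem in the last point can be illustrated with a toy sketch in Python. This is not a real detection library and not a proposal for diaspora: the stopword lists, thresholds and function name are assumptions chosen only to show a detector that refuses to guess when it has too little text to go on.

```python
# Tiny, hand-picked stopword lists; a real detector would need far more data.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "it"},
    "de": {"der", "die", "und", "ist", "das", "nicht", "ich", "zu"},
}


def detect_language(text, min_words=8, min_hits=3):
    """Return a language code, or None when the evidence is too weak."""
    words = [w.strip(".,!?:;") for w in text.lower().split()]
    # Short posts carry too little signal, so abstain instead of guessing.
    if len(words) < min_words:
        return None
    hits = {
        lang: sum(1 for w in words if w in stopwords)
        for lang, stopwords in STOPWORDS.items()
    }
    ranked = sorted(hits.items(), key=lambda item: item[1], reverse=True)
    (best_lang, best_hits), (_, second_hits) = ranked[0], ranked[1]
    # Abstain on weak or ambiguous evidence as well.
    if best_hits < min_hits or best_hits == second_hits:
        return None
    return best_lang


short_post = detect_language("New release is out!")
long_english = detect_language("The cat is in the house and it is sleeping.")
long_german = detect_language("Ich bin nicht sicher, ob das die richtige Antwort ist.")
```

With these thresholds the four-word post gets no language at all, which is exactly the gap described above: the posts that are hardest to classify are the short ones that make up much of a stream.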

We could offer an automatic translation button, but then again, that has huge implications. Personally, I’d very much dislike seeing private posts transmitted to external services.

The reason people run into language issues in the first place is the way diaspora wants to handle content discovery. Instead of using network effects, we just allow people to use hashtags to mark and discover content. Tags, especially tags that contain predefined terms or product names/trademarks, are generally not language-specific, so people end up subscribing to posts they don’t understand. This is, indeed, an issue, but I’m still not convinced that language filtering is a good solution.

Perhaps this might be a good resource: http://fosmt.org/

Since diaspora has many different users who speak many different languages, I find myself not being able to understand many posts.

Built-in translation could be diaspora’s “killer app”


Note: This discussion was imported from Loomio. Click here to view the original discussion.

Have you searched for previous issues regarding this? Because there are some.

I did a cursory search but found nothing. I’ll be more thorough in the future.

Keep in mind that Loomio’s search facility only returns results from the sub-groups that you have joined (which is slightly annoying). So if you’re not a member of sub-groups in which such a proposal is likely to have been posted, such as ‘Feature Proposals’, you’d need to look manually through the threads in that sub-group to find any previous discussion (which is slightly more annoying!).

Okay, thanks for the heads up!

I totally agree on this and maybe Loomio should be top priority to have this implemented, for democratic reasons. :)
I’d say this function should avoid Google Translate by having the pods handle the translations, for privacy reasons.

Until then, there are browser plug-ins like Translate This and Instant Translate. The best option is always to learn the language; Instant Translate could help with this through its double-click translate (needs to be enabled).

Could this somehow be moved to Feature Proposals?

Most of those browser add-ons still use Google Translate, unfortunately. And when I say ‘most’, I mean ‘good luck finding one that doesn’t.’ Instant Translate, which you mentioned, appears to be closed-source. I would love to see a browser extension that uses Apertium.

This is what I’ve been using. And it seems to work alright. The reviews report a lot of odd behavior, and the possibility that it does not respect privacy. Personally, I have not noticed any of these issues. However, you could chalk that up to the fact that I’m not on Windows, and have fairly heavy privacy controls.

While I fully support the idea of auto-translating on Diaspora*, it really seems like a moonshot kind of effort. I want it eventually, but adding translation to all the other things pods are doing would be computationally expensive.

How about supporting multi-lingual posts? The one posting the message could do the translation himself, picking his target audience and removing errors by hand.
I think there was the proposal for specific language tags as well, which would fit into this.
This way the pod could eventually do the remaining translations during otherwise idle time (with some settings for the podmin). The order of translation could be based on anonymous statistics on network wide language preferences and post time.

I really like the idea of users specifying the language of a post during its creation. The obvious advantage of this would be:

Filtering - the ability to filter non-English posts would be brilliant. Even with the best translation services available, the post quality would be questionable at best, and promoting any sort of discussion through translated text would be difficult, to say the least.
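
A rough sketch of what such a per-post filter could look like, in Python. The `lang` field, the `show_untagged` switch and the function name are hypothetical and only illustrate the idea; nothing here reflects how diaspora actually stores posts.

```python
def visible_posts(posts, wanted_langs, show_untagged=True):
    """Filter a stream by the language tag the author chose when posting.

    posts         -- list of dicts; "lang" is missing or None when untagged
    wanted_langs  -- languages the reader wants to see
    show_untagged -- whether posts without a language tag stay visible
    """
    wanted = set(wanted_langs)
    visible = []
    for post in posts:
        lang = post.get("lang")
        if lang is None:
            # Untagged posts cannot be judged, so the reader's setting decides.
            if show_untagged:
                visible.append(post)
        elif lang in wanted:
            visible.append(post)
    return visible


public_stream = [
    {"text": "Hello world", "lang": "en"},
    {"text": "Hallo Welt", "lang": "de"},
    {"text": "Bonjour le monde"},  # author did not pick a language
]
lenient = visible_posts(public_stream, {"en"})
strict = visible_posts(public_stream, {"en"}, show_untagged=False)
```

The `show_untagged` switch is where the design gets awkward: if tagging is optional and few people do it, the lenient setting filters almost nothing and the strict setting hides almost everything.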

I think translations would be a huge undertaking, and a simple feature like the one described above could be a really quick fix that would improve the user experience tremendously. Seriously, /public is completely impossible to browse, and good luck having a healthy social community when people can’t discover anyone to communicate with…

Edit: I’ve made a little mockup to illustrate this:
mockup

Coming from Google+, I was a heavy user of the “translate” button. I’d love to see something similar here.